Description



[0001] The present invention relates to a method for the identification and detection of organisms using the sequences of their genes encoding the B subunit of DNA gyrase.
[0002] This invention is useful in medical fields as well as in various industrial fields where the identification/classification or detection/monitoring of specific microorganisms (bacteria, yeasts, fungi and archaea), especially bacteria, is necessary.

[0003] Conventionally, the identification/classification of living organisms has been carried out using a combination of biochemical and morphological tests. However, these tests often did not provide unequivocal answers as to the taxonomic positions of the tested organisms.

[0004] Recently, the taxonomy of organisms, in particular of bacteria, based on rRNA sequences has become popular. There are several reasons why rRNA molecules have been selected as standard molecules for molecular taxonomy. They are constituents of all organisms. They exist in abundance and can therefore readily be isolated and characterized. For sequence comparison, the many conserved regions of rRNA molecules allow alignment between distantly related organisms, while the variable regions are useful for distinguishing closely related organisms (Van de Peer, Y., S. Chapelle, and R. De Wachter. 1996. A quantitative map of nucleotide substitution rates in bacterial rRNA. Nucleic Acids Res. 24: 3381-3391; and Gutell, R. R., N. Larsen, and C. R. Woese. 1994. Lessons from an evolving rRNA: 16S and 23S rRNA structures from a comparative perspective. Microbiol. Rev. 58: 10-26). Furthermore, there is little evidence for the horizontal transfer of rRNA genes, although many other genes are expected to have been transferred frequently from one species to other, distantly related species. At present, rRNA sequences are accumulating rapidly and are accessible via an international database (Ribosomal Database Project, http://rcfc.lrfe.uiuc.edu/).
[0005] However, because rRNA genes evolve extremely slowly, there is little difference in rRNA sequence between closely related organisms. Therefore, species belonging to the same genus often cannot be discriminated by analysis of rRNA sequences. For example, it is said that bacteria sharing more than 97 % identity in their 16S rRNA sequences (bacterial small-subunit rRNA) might belong to the same species. However, there are cases of bacteria exhibiting more than 99 % identity in their 16S rRNA sequences and yet belonging to two distinct species, as revealed by DNA hybridization analysis. Evidently, owing to the slow divergent evolution of the 16S rRNA gene, the resolution of 16S rRNA-based analysis between closely related organisms is lower than that of DNA hybridization analysis (Stackebrandt, E. and Goebel, B. M. 1994. Taxonomic note: a place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology. Int. J. Syst. Bacteriol. 37: 463-464).
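The identity percentages discussed above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps; the function name `percent_identity` and the toy sequences are hypothetical and are not real 16S rRNA data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned, gap-free columns at which two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # alignment gaps are not counted as compared positions
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy 100-nucleotide sequences differing at a single position.
ref = "ACGT" * 25
variant = ref[:50] + "T" + ref[51:]  # position 50 is "G" in ref
print(percent_identity(ref, variant))  # 99.0
```

On such a measure, two sequences can score at or above 99 % and still come from distinct species, which is the resolution limit the paragraph describes.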

[0006] Other problems exist in rRNA-based phylogenetic analysis. To establish a phylogenetic relationship based on rRNA sequences, the sequences must be aligned. The alignment of rRNA sequences, which are composed of four different constituents (A, U, C and G), is however not easy and requires some expertise. The correct sequencing of rRNA genes is also difficult, largely owing to their highly ordered structure. Furthermore, polymorphism of rRNA has been found in some organisms.

[0007] In contrast, protein-encoding genes have evolved more rapidly than rRNA-encoding genes, since they tolerate so-called neutral mutations that do not cause any amino acid substitution in their gene products. It is therefore expected that more precise phylogenetic analysis can be performed using such protein-encoding genes than using rRNA sequences. Thus, the present inventors have developed and applied a method for the identification/classification or detection/monitoring of organisms using the sequences of gyrB genes, which encode the B subunit of DNA gyrase (Yamamoto, S. and Harayama, S. 1995. PCR Amplification and Direct Sequencing of gyrB Genes with Universal Primers and Their Application to the Detection and Taxonomic Analysis of Pseudomonas putida Strains. Appl. Environ. Microbiol. 61: 1104-1109; Yamamoto, S. and Harayama, S. 1996. Phylogenetic Analysis of Acinetobacter Strains Based on the Nucleotide Sequences of gyrB Genes and on the Amino Acid Sequences of Their Products. Int. J. Syst. Bacteriol. 46: 506-511; Yamamoto, S. and Harayama, S. 1998. Phylogenetic relationships of Pseudomonas putida strains deduced from the nucleotide sequences of gyrB, rpoD and 16S rRNA genes. Int. J. Syst. Bacteriol. 48: 813-819; Yamamoto, S., Bouvet, P. J. M. and Harayama, S. 1998. Phylogenetic structures of the genus Acinetobacter based on the gyrB sequences: Comparison with the grouping by DNA-DNA hybridization. Int. J. Syst. Bacteriol. (in press); Harayama, S. and Yamamoto, S. 1996. Phylogenetic Identification of Pseudomonas Strains Based on a Comparison of gyrB and rpoD Sequences, p. 250-258 in Molecular Biology of Pseudomonads, edited by T. Nakazawa, K. Furukawa, D. Haas, S. Silver. ASM Press, Washington, D.C.; and Watanabe, K., Yamamoto, S., Hino, S. and Harayama, S. 1998. Population dynamics of phenol-degrading bacteria in activated sludge determined by gyrB-targeted quantitative PCR. Appl. Environ. Microbiol. 64: 1203-1209).
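The notion of a neutral mutation, a nucleotide change that leaves the encoded amino acid unchanged, can be made concrete with a short sketch. It assumes the standard genetic code and two in-frame coding sequences of equal length; the function name `codon_changes` and the nine-nucleotide toy sequences are hypothetical illustrations, not data from the cited studies.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_changes(cds_a: str, cds_b: str) -> tuple:
    """Count differing codons as (synonymous, nonsynonymous)."""
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("expected two in-frame sequences of equal length")
    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(cds_a), 3):
        a = cds_a[i:i + 3].upper()
        b = cds_b[i:i + 3].upper()
        if a == b:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            synonymous += 1      # nucleotide change, same amino acid
        else:
            nonsynonymous += 1   # nucleotide change alters the amino acid
    return synonymous, nonsynonymous


original = "GGTGGCAAG"  # Gly Gly Lys
silent = "GGCGGAAAA"    # Gly Gly Lys, every codon changed at its third base
altered = "GGTGGCGAG"   # Gly Gly Glu
print(codon_changes(original, silent))   # (3, 0)
print(codon_changes(original, altered))  # (0, 1)
```

In the first comparison all three codons differ at the nucleotide level while the protein is identical, which is why a protein-encoding gene can separate organisms whose gene products are still conserved.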

[0008] DNA topoisomerases are essential for the replication, transcription, recombination and repair of DNA, and control the level of supercoiling of DNA molecules by cleaving and resealing the phosphodiester bonds of DNA. They are classified into type I (EC 5.99.1.2) and type II (EC 5.99.1.3) according to their enzymatic properties. DNA gyrase is a type II topoisomerase that is capable of introducing negative supercoils into a relaxed closed circular DNA molecule. This reaction is coupled with ATP hydrolysis. DNA gyrase can also relax supercoiled DNA without ATP hydrolysis. DNA gyrase consists of two subunit proteins in an A2B2 quaternary structure. The A subunit (GyrA) has a molecular weight of approximately 100 kDa, while the B subunit (GyrB) has a molecular weight of either 90 kDa or 70 kDa (Wigley, D. B. 1995. Structure and mechanism of DNA topoisomerases. Ann. Rev. Biomol. Struct. 24: 185-208). The genes for DNA gyrase or its isofunctional enzymes should exist in all organisms, as they are indispensable for cell proliferation.
[0009] As described above, the present inventors have already developed and successfully applied a method for the identification/classification or detection/monitoring of organisms using gyrB sequences. In this method, a gyrB gene fragment of an organism of interest is amplified by PCR using primers designed from two amino acid sequences, His-Ala-Gly-Gly-Lys-Phe-Asp and Met-Thr-Asp-Ala-Asp-Val-Asp-Gly, which are highly conserved among the GyrB sequences of many organisms. Subsequently, the amplified fragments are subjected to direct sequencing. Since gyrB genes code for proteins, they have frequently undergone neutral mutations. Thus, the nucleotide sequences of gyrB genes vary considerably even among related organisms. For this reason, the above method has been shown to be effective for discriminating organisms at the level of species or subspecies. The above-mentioned PCR primers designed from the highly conserved amino acid sequences of GyrB were effective for the PCR amplification of gyrB in many, but not all, bacterial species. From the DNA of some bacterial species, no PCR amplification was observed using these primers.
[0010] There was also another problem associated with these primers. The genes for type IV topoisomerase (parE) were also amplified from the DNA of some bacterial species by using these primers. Topoisomerase IV (ParE) is a bacterial enzyme that appears to be closely related to DNA gyrase. This enzyme is involved in the partitioning of chromosomes into daughter cells. If a parE gene rather than a gyrB gene is amplified from a DNA sample, and a phylogenetic analysis is carried out without recognizing that the amplified sequence is parE and not gyrB, the phylogenetic analysis will be confused. To avoid this problem of amplifying paralogous genes, primers which do not amplify parE should be developed.

[0011] It is an object of the present invention to solve the above-described problems of the primers and to provide a means which enables the identification/classification and detection/monitoring of a wide range of organisms using gyrB sequences.

[0012] By comparing the amino acid sequence data of GyrB collected by the inventors with those of ParE, the inventors have found a plurality of amino acid sequences of GyrB which are appropriate for designing PCR primers capable of specifically amplifying gyrB genes. By using the newly designed PCR primers in combination with the primers mentioned in the section BACKGROUND OF THE INVENTION, it became possible to determine gyrB sequences more easily and precisely from a wider range of organisms. The present invention has been achieved based on these findings.

[0013] The present invention relates to a method for the identification and detection of organisms using nucleotide sequences amplified by using two primers, at least one of the primers being an oligonucleotide which codes for all or a part of one of the following amino acid sequences (a) through (l),

(a) (Pro or Ser)-(Ala or Thr)-(Ala, Val or Leu)-(Glu or Asp)-(Val or Thr)-(Ile or Val)-(Met, Leu or Phe)-Thr-(Val, Gln or Ile)-Leu-His-Ala-Gly-Gly-Lys-Phe-(Asp or Gly)-(Asp, Gly, Asn or Ser)-(Ser, Lys, Gly, Asp or Asn)
(b) Gly-Gly-Thr-His
(c) (Ile or Leu)-Met-Thr-Asp-Ala-Asp-Val-Asp-Gly-(Ala or Ser)-His-Ile-Arg-Thr-Leu
(d) Arg-Lys-Arg-Pro-(Gly or Ala)-Met-Tyr-Ile-Gly-(Ser or Asp)-Thr
(e) Gln-(Thr or Pro)-(Lys or Asn)-(Thr, Asp, Gly, Lys, Ser, Phe or Tyr)-Lys-Leu
(f) (Tyr or Phe)-Lys-Gly-Leu-Gly-Glu-Met-Asn-(Ala or Pro)
(g) Val-Glu-Gly-Asp-Ser-Ala-Gly-Gly-Ser
(h) Lys-(His or Val)-Pro-Asp-Pro-(Gln or Lys)-Phe
(i) Leu-Pro-Gly-Lys-Leu-Ala-Asp-Cys-(Ser or Gln)-(Ser or Glu)-(Lys or Arg)-Asp-Pro-(Ala or Ser)
(j) Gln-Leu-(Trp or Arg)-(Glu or Asp)-Thr-Thr-(Met or Leu)-(Asp or Asn)-Pro
(k) Ala-(Lys or Arg)-(Lys or Arg)-Ala-Arg-Glu
(l) Phe-Thr-Asn-Asn-Ile-(Pro or Asn)-(Thr or Gln)

and which functions as a substantial primer. A primer which functions as a substantial primer, as used herein, means an oligonucleotide having such a length that allows specific hybridization to a specific site in a template DNA.
[0014] The following drawing illustrates the invention:

Fig. 1 shows the positional relationship between the amino acid sequences (a) through (l) and the amino acid sequences of GyrB of several organisms.

[0015] Hereinbelow, the present invention will be described in detail.
[0016] In the present invention, a part of the gyrB gene of an organism of interest is amplified specifically by PCR, and the nucleotide sequence of the amplified fragment is then determined for the taxonomic characterization of the organism.
[0017] As a PCR primer, an oligonucleotide may be used which codes for all or a part of one of the amino acid sequences (a) through (l) listed above and which functions as a substantial primer. The relationship between the amino acid sequences (a) through (l) and the amino acid sequences of GyrB from Bacillus subtilis 168 strain, Escherichia coli K-12 strain and Pseudomonas putida PRS200 strain is shown in Fig. 1.

[0018] Most of the amino acid sequences (a) through (l) are degenerate, and numerous oligonucleotide sequences can be designed from them. The following amino acid sequences can be enumerated as examples of amino acid sequences to be used for the design of oligonucleotide primers, while the following nucleotide sequences can be enumerated as examples of specific primers.
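How a single amino acid sequence gives rise to many candidate oligonucleotides can be sketched as follows. This is a simplified illustration, not the design procedure used for the primers of the SEQ ID NOS listed below: it assumes the standard genetic code, collapses the codons of each residue position by position into IUPAC ambiguity codes, and returns only the sense-strand sequence. For residues with six codons (Leu, Ser, Arg) this position-wise collapse also covers a few codons of other amino acids. The names `back_translate` and `degeneracy` are hypothetical.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append("".join(codon))

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# IUPAC nucleotide ambiguity codes, keyed by the sorted set of bases they cover.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def degenerate_codon(aa: str) -> str:
    """Collapse all codons of one amino acid into a three-letter IUPAC codon."""
    codons = CODONS[aa]
    return "".join(
        IUPAC["".join(sorted({c[i] for c in codons}))] for i in range(3)
    )


def back_translate(motif: str) -> str:
    """Back-translate a motif such as 'His-Ala-Gly' into a degenerate oligonucleotide."""
    return "".join(degenerate_codon(THREE_TO_ONE[res]) for res in motif.split("-"))


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    sizes = {code: len(bases) for bases, code in IUPAC.items()}
    return prod(sizes[base] for base in oligo)


oligo = back_translate("His-Ala-Gly-Gly-Lys-Phe-Asp")
print(oligo)              # CAYGCNGGNGGNAARTTYGAY
print(degeneracy(oligo))  # 1024
```

A primer for the opposite strand would be the reverse complement of such a sequence. Positions in the lists (a) through (l) that allow alternative residues increase the number of candidate oligonucleotides further.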

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (a):

The amino acid sequences shown in SEQ ID NOS: 26, 30, 54, 55, 56 and 57 can be given as amino acid sequences to be used for the design of oligonucleotide primers.

The nucleotide sequences shown in SEQ ID NOS: 25, 29 and 53 can be given as sequences of primers for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (b):

The amino acid sequences shown in SEQ ID NOS: 34, 36 and 37 can be given as amino acid sequences to be used for the design of oligonucleotide primers.

The nucleotide sequences shown in SEQ ID NOS: 33 and 35 can be given as sequences of primers for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (c):

The amino acid sequences shown in SEQ ID NOS: 28, 32 and 42 can be given as amino acid sequences to be used for the design of oligonucleotide primers.

The nucleotide sequences shown in SEQ ID NOS: 27, 31 and 41 can be given as sequences of primers for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (d):

The amino acid sequences shown in SEQ ID NOS: 46 and 47 can be given as amino acid sequences to be used for the design of oligonucleotide primers.

The nucleotide sequence shown in SEQ ID NO: 45 can be given as the sequence of a primer for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (e):

The amino acid sequences shown in SEQ ID NOS: 39 and 40 can be given as amino acid sequences to be used for the design of oligonucleotide primers.

The nucleotide sequence shown in SEQ ID NO: 38 can be given as the sequence of a primer for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (f):

The amino acid sequence shown in SEQ ID NO: 44 can be given as an amino acid sequence to be used for the design of oligonucleotide primers.

The nucleotide sequence shown in SEQ ID NO: 43 can be given as the sequence of a primer for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (g):

The amino acid sequence shown in SEQ ID NO: 49 can be given as an amino acid sequence to be used for the design of oligonucleotide primers.

The nucleotide sequence shown in SEQ ID NO: 48 can be given as the sequence of a primer for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (h):

The amino acid sequences shown in SEQ ID NOS: 63 and 64 can be given as amino acid sequences to be used for the design of oligonucleotide primers.

The nucleotide sequence shown in SEQ ID NO: 62 can be given as the sequence of a primer for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (i):

The amino acid sequence shown in SEQ ID NO: 59 can be given as an amino acid sequence to be used for the design of oligonucleotide primers.

The nucleotide sequence shown in SEQ ID NO: 58 can be given as the sequence of a primer for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (j):

The amino acid sequences shown in SEQ ID NOS: 66, 67 and 68 can be given as amino acid sequences to be used for the design of oligonucleotide primers.

The nucleotide sequence shown in SEQ ID NO: 65 can be given as the sequence of a primer for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (k):

The amino acid sequences shown in SEQ ID NOS: 51 and 52 can be given as amino acid sequences to be used for the design of oligonucleotide primers.

The nucleotide sequence shown in SEQ ID NO: 50 can be given as the sequence of a primer for the specific amplification of gyrB.

Amino Acid and Oligonucleotide Sequences Corresponding to the Amino Acid Sequence (l):

The amino acid sequence shown in SEQ ID NO: 61 can be given as an amino acid sequence to be used for the design of oligonucleotide primers.

The nucleotide sequence shown in SEQ ID NO: 60 can be given as the sequence of a primer for the specific amplification of gyrB.

[0019] The correspondence between each amino acid sequence and oligonucleotide sequence is shown in Table 1.

Table 1

sequence    amino acid                    oligonucleotide
a           SEQ ID NO. 26                 SEQ ID NO. 25
            SEQ ID NO. 30                 SEQ ID NO. 29
            SEQ ID NO. 54, 55, 56, 57     SEQ ID NO. 53
b           SEQ ID NO. 34                 SEQ ID NO. 33
            SEQ ID NO. 36, 37             SEQ ID NO. 35
c           SEQ ID NO. 28                 SEQ ID NO. 27
            SEQ ID NO. 32                 SEQ ID NO. 31
            SEQ ID NO. 42                 SEQ ID NO. 41
d           SEQ ID NO. 46, 47             SEQ ID NO. 45
e           SEQ ID NO. 39, 40             SEQ ID NO. 38
f           SEQ ID NO. 44                 SEQ ID NO. 43
g           SEQ ID NO. 49                 SEQ ID NO. 48
h           SEQ ID NO. 63, 64             SEQ ID NO. 62
i           SEQ ID NO. 59                 SEQ ID NO. 58
j           SEQ ID NO. 66, 67, 68         SEQ ID NO. 65
k           SEQ ID NO. 51, 52             SEQ ID NO. 50
l           SEQ ID NO. 61                 SEQ ID NO. 60



[0020] The amino acid sequences (a) through (l) are not necessarily conserved in all GyrB proteins. Therefore, primers allowing the amplification of gyrB should be selected appropriately.
[0021] The nucleotide sequence of the amplified PCR product can be determined directly, without subcloning, by using primers complementary to either the 5'-end or the 3'-end of the product.

Examples: 

EXAMPLE 1

[0022] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 25 and 27 (corresponding to the amino acid sequences of SEQ ID NOS: 26 and 28, respectively) as primers and DNA from Bacteroides vulgatus IFO 14291 strain as a template. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 1 and 2, respectively. The PCR amplification conditions were as described below.

PCR amplification conditions:

96 °C 1 min; 48 °C 1 min; 72 °C 2 min:     3 cycles
96 °C 1 min; 48 °C 1 min; 72 °C 2 min:     3 cycles
96 °C 1 min; 48 °C 1 min; 72 °C 2 min:    30 cycles
Total:                                     36 cycles

Primer concentration    1 µM each
dATP                    200 µM each
Template DNA            < 1 ng/100 µl

[0023] AmpliTaq™ and the supplied PCR Buffer (Perkin Elmer) were used. 

EXAMPLE 2 

[0024] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 29 and 31 (corresponding to the amino acid sequences of SEQ ID NOS: 30 and 32, respectively) as primers and DNA from Mycobacterium simiae KPM 1403 strain as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 3 and 4, respectively.

EXAMPLE 3 

[0025] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 33 and 27 (corresponding to the amino acid sequences of SEQ ID NOS: 34 and 28, respectively) as primers and DNA from Chitinophaga pinensis DSM 2588 strain as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 5 and 6, respectively.

EXAMPLE 4

[0026] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 25 and 35 (corresponding to the amino acid sequences of SEQ ID NO: 26 and SEQ ID NO: 36 or 37, respectively) as primers and DNA from Flavobacterium aquatile IAM 12316 strain as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 7 and 8, respectively.

EXAMPLE 5 

[0027] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 29 and 38 (corresponding to the amino acid sequences of SEQ ID NO: 30 and SEQ ID NO: 39 or 40, respectively) as primers and DNA from Mycobacterium asiaticum ATCC 25274 strain as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 9 and 10, respectively.

EXAMPLE 6 

[0028] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 41 and 43 (corresponding to the amino acid sequences of SEQ ID NOS: 42 and 44, respectively) as primers and DNA from Cytophaga lytica IFO 16020 strain as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 11 and 12, respectively.

EP 0 935 003 A2

EXAMPLE 7 

[0029] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 45 and 48 (corresponding to the amino acid sequences of SEQ ID NO: 46 or 47 and SEQ ID NO: 49, respectively) as primers and DNA from Synechococcus sp. PCC 6301 strain as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 13 and 14, respectively.

EXAMPLE 8 

[0030] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 53 and 62 (corresponding to the amino acid sequences of SEQ ID NO: 54, 55, 56 or 57 and SEQ ID NO: 63 or 64, respectively) as primers and DNA from Caulobacter crescentus ATCC 15252 strain as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 15 and 16, respectively.

EXAMPLE 9 

[0031] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 53 and 58 (corresponding to the amino acid sequences of SEQ ID NO: 54, 55, 56 or 57 and SEQ ID NO: 59, respectively) as primers and DNA from Pseudomonas putida ATCC 17484 strain as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 17 and 18, respectively.

EXAMPLE 10

[0032] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 65 and 50 (corresponding to the amino acid sequences of SEQ ID NO: 66, 67 or 68 and SEQ ID NO: 51 or 52, respectively) as primers and DNA from Synechococcus sp. PCC 6301 strain as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 19 and 20, respectively.

EXAMPLE 11 

[0033] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 60 and 31 (corresponding to the amino acid sequences of SEQ ID NOS: 61 and 32, respectively) as primers and DNA from Caulobacter crescentus ATCC 15252 strain as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 21 and 22, respectively.

EXAMPLE 12 

[0034] A PCR was performed using oligonucleotides represented by the nucleotide sequences shown in SEQ ID NOS: 25 and 43 (corresponding to the amino acid sequences of SEQ ID NOS: 26 and 44, respectively) as primers and DNA from an unidentified strain MBIC 1544 as a template. The PCR amplification conditions were the same as in Example 1. The nucleotide sequence of the amplified DNA fragment and the amino acid sequence deduced therefrom are shown in SEQ ID NOS: 23 and 24, respectively.

[0035] This nucleotide sequence was compared with the nucleotide sequence database possessed by the applicant. As a result, the unidentified strain MBIC 1544 was identified as Cytophaga lytica.
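The comparison of a determined gyrB sequence against a reference collection can be sketched as a nearest-match search. The following is only an illustration under strong assumptions: sequences are compared position by position without alignment, the two reference entries are merely the first 21 nucleotides of SEQ ID NOS: 1 and 3, the query is an invented variant, and the names `identity` and `best_match` are hypothetical. It does not represent the applicant's database or search method.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions over the shorter of two sequences."""
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("empty sequence")
    return sum(x == y for x, y in zip(a[:n], b[:n])) / n


def best_match(query: str, references: dict) -> tuple:
    """Return (name, identity) of the reference sequence most similar to the query."""
    name = max(references, key=lambda key: identity(query, references[key]))
    return name, identity(query, references[name])


# Toy reference set: first 21 nucleotides of SEQ ID NO: 1 and SEQ ID NO: 3.
references = {
    "Bacteroides vulgatus": "GACAAAGGTTCTTACAAGGTT",
    "Mycobacterium simiae": "GGGGAGAACAGTGGCTACACC",
}
query = "GACAAAGGTTCTTACAAGGTA"  # invented query, one mismatch to the first entry

name, score = best_match(query, references)
print(name, round(score, 3))  # Bacteroides vulgatus 0.952
```

In practice the query and references would first be aligned, and the full amplified fragment rather than a short prefix would be compared.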
[0036] With the nucleotide sequence of gyrB determined by the present invention, it is possible to classify or identify an unidentified microorganism strain quickly and accurately. In addition, according to the present invention, PCR primers for monitoring a specific microorganism, which are needed for risk assessment in various bioprocesses, can be designed easily. The present invention also enables highly accurate monitoring of changes in mycelial tufts.
[0037] The invention thus allows the determination of the presence or amount of a microorganism in a sample. The invention may be applicable in medical and industrial contexts. For example, in a medical context a sample can be tested for the presence of a microorganism, which may be useful in assessing infection. The sample could therefore be serum, blood plasma, or a swab from the eye, ear, mouth, throat, urethra, cervix, vagina, penis or rectum. Alternatively, the sample could be a sweep from a culture of bacteria grown on solid media or in liquid media.
[0038] In an industrial context a sample could be tested to determine the amount of a microorganism in a fermentation process. Alternatively, a sample could be tested to assess contamination of fermentation broths by unwanted microorganisms. Thus, the sample can be a sample from any bioprocess or fermentation process where the amount or presence of a microorganism needs to be ascertained.
SEQUENCE LISTING

SEQ ID NO: 1
SEQUENCE LENGTH: 1212
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE
ORGANISM: Bacteroides vulgatus
STRAIN: IFO 14291
SEQUENCE DESCRIPTION

GAC AAA GGT TCT TAC AAG GTT TCA GGC GGT CTG CAC GGT GTA GGT GTT  48
Asp Lys Gly Ser Tyr Lys Val Ser Gly Gly Leu His Gly Val Gly Val
1               5                   10                  15

TCT TGT GTG AAC GCC TTG TCT ACT CAC ATG ACC ACA CAG GTA TTC CGC  96
Ser Cys Val Asn Ala Leu Ser Thr His Met Thr Thr Gln Val Phe Arg
            20                  25                  30

GGT GGC AAG ATC TAC CAG CAG GAA TAC AGC TGC GGA CAT CCT TTG TAT  144
Gly Gly Lys Ile Tyr Gln Gln Glu Tyr Ser Cys Gly His Pro Leu Tyr
        35                  40                  45

TCT GTA AAA GAA GTA GGA ACA GCT GAT ATT ACC GGA ACA AAA CAG ACT  192
Ser Val Lys Glu Val Gly Thr Ala Asp Ile Thr Gly Thr Lys Gln Thr
    50                  55                  60

TTC TGG CCG GAT GAT ACC ATC TTC ACT GTT ACC GAA TAT AAG TTT GAC  240
Phe Trp Pro Asp Asp Thr Ile Phe Thr Val Thr Glu Tyr Lys Phe Asp
65                  70                  75                  80

ATT CTA CAG GCA CGT ATG CGT GAA TTG GCC TAC TTG AAC AAA GGT ATC  288
Ile Leu Gln Ala Arg Met Arg Glu Leu Ala Tyr Leu Asn Lys Gly Ile
                85                  90                  95

ACC ATT TCA CTG ACC GAC CGC CGG ATC AAA GAA GAA GAT GGC AGC TTC  336
Thr Ile Ser Leu Thr Asp Arg Arg Ile Lys Glu Glu Asp Gly Ser Phe
            100                 105                 110

AAG AAA GAA ATA TTC CAT TCG GAC GAA GGA GTG AAA GAG TTT GTA CGT  384
Lys Lys Glu Ile Phe His Ser Asp Glu Gly Val Lys Glu Phe Val Arg
        115                 120                 125

TTC CTG AAC CGT AAC AAC GAA GCG CTG ATT AAT GAT GTC ATT TAT CTG  432
Phe Leu Asn Arg Asn Asn Glu Ala Leu Ile Asn Asp Val Ile Tyr Leu
    130                 135                 140
AAT ACC GAA AAA AAC AAT ACC CCC ATT GAA TGT GCC ATC ATG TAC AAT  480
Asn Thr Glu Lys Asn Asn Thr Pro Ile Glu Cys Ala Ile Met Tyr Asn
145                 150                 155                 160

ACA GGC TAT CGT GAA AGC CTG CAT TCG TAT GTA AAC AAT ATC AAT ACA  528
Thr Gly Tyr Arg Glu Ser Leu His Ser Tyr Val Asn Asn Ile Asn Thr
                165                 170                 175

ATA GAA GGC GGT ACA CAC GAG GCC GGT TTC CGC AGC GCA TTA ACC CGT  576
Ile Glu Gly Gly Thr His Glu Ala Gly Phe Arg Ser Ala Leu Thr Arg
            180                 185                 190

GTA CTG AAG AAA TAT GCG GAA GAT ACC AAA GCA CTG GAA AAA GCA AAA  624
Val Leu Lys Lys Tyr Ala Glu Asp Thr Lys Ala Leu Glu Lys Ala Lys
        195                 200                 205

GTC GAG ATT TCG GGA GAG GAC TTC CGC GAA GGC TTG ATT GCC GTC ATT  672
Val Glu Ile Ser Gly Glu Asp Phe Arg Glu Gly Leu Ile Ala Val Ile
    210                 215                 220

TCA GTG AAA GTA GCC GAG CCG CAG TTC GAA GGA CAG ACC AAG ACC AAG  720
Ser Val Lys Val Ala Glu Pro Gln Phe Glu Gly Gln Thr Lys Thr Lys
225                 230                 235                 240
CTG GGC AAC AGC GAA GTG AGT GGT GCC GTG AAC CAA GCT GTA GGC GAA  768
Leu Gly Asn Ser Glu Val Ser Gly Ala Val Asn Gln Ala Val Gly Glu
                245                 250                 255

GCG CTT ACA TAT TAT CTG GAA GAA CAT CCG AAA GAA GCA AAA CAG ATT  816
Ala Leu Thr Tyr Tyr Leu Glu Glu His Pro Lys Glu Ala Lys Gln Ile
            260                 265                 270

GTT GAC AAA GTG ATC CTG GCT GCA ACA GCG CGT ATC GCC GCA CGC AAG  864
Val Asp Lys Val Ile Leu Ala Ala Thr Ala Arg Ile Ala Ala Arg Lys
        275                 280                 285

GCA CGT GAA TCT GTT CAA AGA AAG AGT CCG ATG GGC GGT GGC GGA CTG  912
Ala Arg Glu Ser Val Gln Arg Lys Ser Pro Met Gly Gly Gly Gly Leu
    290                 295                 300

CCG GGC AAA CTG GCC GAC TGC TCG AGC CGT AAT CCG GAG GAA TGT GAA  960
Pro Gly Lys Leu Ala Asp Cys Ser Ser Arg Asn Pro Glu Glu Cys Glu
305                 310                 315                 320

CTA TTC CTG GTC GAG GGT GAC TCG GCA GGT GGT TCT GCC AAG CAA GGA  1008
Leu Phe Leu Val Glu Gly Asp Ser Ala Gly Gly Ser Ala Lys Gln Gly
                325                 330                 335

CGT AGC CGT GCC TTC CAG GCA ATT CTA CCT TTG AGG GGT AAA ATC CTG  1056
Arg Ser Arg Ala Phe Gln Ala Ile Leu Pro Leu Arg Gly Lys Ile Leu
            340                 345                 350

AAT GTG GAA AAA GCG ATG TGG CAC AAG GCT TTT GAA AGC GAT GAG GTC  1104
Asn Val Glu Lys Ala Met Trp His Lys Ala Phe Glu Ser Asp Glu Val
        355                 360                 365

AAT AAT ATC ATC ACC GCC CTG GGT GTC CGT TTC GGT GTG GAC GGA AAT  1152
Asn Asn Ile Ile Thr Ala Leu Gly Val Arg Phe Gly Val Asp Gly Asn
    370                 375                 380

GAT GAC AGC AAA AAA GCG AAC ATC GAC AAG CTG CGT TAT CAC AAA GTG  1200
Asp Asp Ser Lys Lys Ala Asn Ile Asp Lys Leu Arg Tyr His Lys Val
385                 390                 395                 400

GTG ATC ATG ACC  1212
Val Ile Met Thr

SEQ ID NO: 2 
SEQUENCE LENGTH: 404 
SEQUENCE TYPE: amino acid 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
ORIGINAL SOURCE 

ORGANISM: Bacteroides vulgatus 

STRAIN: IFO 14291 
SEQUENCE DESCRIPTION 

Asp Lys Gly Ser Tyr Lys Val Ser Gly Gly Leu His Gly Val Gly Val
1 5 10 15

Ser Cys Val Asn Ala Leu Ser Thr His Met Thr Thr Gln Val Phe Arg
20 25 30

Gly Gly Lys Ile Tyr Gln Gln Glu Tyr Ser Cys Gly His Pro Leu Tyr
35 40 45

Ser Val Lys Glu Val Gly Thr Ala Asp Ile Thr Gly Thr Lys Gln Thr
50 55 60

Phe Trp Pro Asp Asp Thr Ile Phe Thr Val Thr Glu Tyr Lys Phe Asp
65 70 75 80

Ile Leu Gln Ala Arg Met Arg Glu Leu Ala Tyr Leu Asn Lys Gly Ile
85 90 95

Thr Ile Ser Leu Thr Asp Arg Arg Ile Lys Glu Glu Asp Gly Ser Phe
100 105 110

Lys Lys Glu Ile Phe His Ser Asp Glu Gly Val Lys Glu Phe Val Arg
115 120 125

Phe Leu Asn Arg Asn Asn Glu Ala Leu Ile Asn Asp Val Ile Tyr Leu
130 135 140

Asn Thr Glu Lys Asn Asn Thr Pro Ile Glu Cys Ala Ile Met Tyr Asn
145 150 155 160

Thr Gly Tyr Arg Glu Ser Leu His Ser Tyr Val Asn Asn Ile Asn Thr
165 170 175

Ile Glu Gly Gly Thr His Glu Ala Gly Phe Arg Ser Ala Leu Thr Arg
180 185 190

Val Leu Lys Lys Tyr Ala Glu Asp Thr Lys Ala Leu Glu Lys Ala Lys
195 200 205

Val Glu Ile Ser Gly Glu Asp Phe Arg Glu Gly Leu Ile Ala Val Ile
210 215 220

Ser Val Lys Val Ala Glu Pro Gln Phe Glu Gly Gln Thr Lys Thr Lys
225 230 235 240

Leu Gly Asn Ser Glu Val Ser Gly Ala Val Asn Gln Ala Val Gly Glu
245 250 255

Ala Leu Thr Tyr Tyr Leu Glu Glu His Pro Lys Glu Ala Lys Gln Ile
260 265 270

Val Asp Lys Val Ile Leu Ala Ala Thr Ala Arg Ile Ala Ala Arg Lys
275 280 285

Ala Arg Glu Ser Val Gln Arg Lys Ser Pro Met Gly Gly Gly Gly Leu
290 295 300

Pro Gly Lys Leu Ala Asp Cys Ser Ser Arg Asn Pro Glu Glu Cys Glu
305 310 315 320

Leu Phe Leu Val Glu Gly Asp Ser Ala Gly Gly Ser Ala Lys Gln Gly
325 330 335

Arg Ser Arg Ala Phe Gln Ala Ile Leu Pro Leu Arg Gly Lys Ile Leu
340 345 350

Asn Val Glu Lys Ala Met Trp His Lys Ala Phe Glu Ser Asp Glu Val
355 360 365

Asn Asn Ile Ile Thr Ala Leu Gly Val Arg Phe Gly Val Asp Gly Asn
370 375 380

Asp Asp Ser Lys Lys Ala Asn Ile Asp Lys Leu Arg Tyr His Lys Val
385 390 395 400

Val Ile Met Thr



SEQ ID NO: 3
SEQUENCE LENGTH: 1263
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear 
MOLECULE TYPE: genomic DNA 
ORIGINAL SOURCE 

ORGANISM: Mycobacterium simiae 
STRAIN: KPM 1403 
SEQUENCE DESCRIPTION 

GGG GAG AAC AGT GGC TAC ACC GTC AGC GGC GGG TTG CAC GGG GTC GGA 48
Gly Glu Asn Ser Gly Tyr Thr Val Ser Gly Gly Leu His Gly Val Gly
1 5 10 15

GTG TCG GTG GTC AAC GCC CTG TCC ACC CGC CTG GAA GTC AAC GTC AAG 96
Val Ser Val Val Asn Ala Leu Ser Thr Arg Leu Glu Val Asn Val Lys
20 25 30

CGT GAC GGC TAT GAG TGG TTC CAG TAC TAC GAC CGG GCG GTG CCC GGC 144
Arg Asp Gly Tyr Glu Trp Phe Gln Tyr Tyr Asp Arg Ala Val Pro Gly
35 40 45

ACC CTC AAG CAA GGC GAG GCG ACC AAG AAG ACC GGC ACC ACG ATC CGG 192
Thr Leu Lys Gln Gly Glu Ala Thr Lys Lys Thr Gly Thr Thr Ile Arg
50 55 60

TTC TGG GCC GAT CCT GAG ATC TTC GAA ACC ACC CAG TAC GAC TTC GAG 240
Phe Trp Ala Asp Pro Glu Ile Phe Glu Thr Thr Gln Tyr Asp Phe Glu
65 70 75 80

ACG GTG GCG CGC CGG TTG CAG GAA ATG GCG TTC CTC AAC AAG GGC CTG 288
Thr Val Ala Arg Arg Leu Gln Glu Met Ala Phe Leu Asn Lys Gly Leu
85 90 95

ACC ATC AAC CTC ACC GAC GAA CGT GTC GAG CAG GAC GAG GTG GTC GAT 336
Thr Ile Asn Leu Thr Asp Glu Arg Val Glu Gln Asp Glu Val Val Asp
100 105 110

GAG GTG GTT AGC GAC ACC GCC GAG GCG CCG AAG TCA GCC GAG GAG CAG 384
Glu Val Val Ser Asp Thr Ala Glu Ala Pro Lys Ser Ala Glu Glu Gln
115 120 125

GCG GCC GAA TCG GCC AAG CCG CAC AAG GTC AAG CAC CGC ACG TTC CAC 432
Ala Ala Glu Ser Ala Lys Pro His Lys Val Lys His Arg Thr Phe His
130 135 140

TAC CCG GGT GGG TTG GTG GAT TTC GTC AAG CAC ATC AAT CGC ACC AAA 480
Tyr Pro Gly Gly Leu Val Asp Phe Val Lys His Ile Asn Arg Thr Lys
145 150 155 160

AAC CCG ATC CAG CAG AGC GTC ATC GAC TTC GAC GGC AAA GGA ACC GGG 528
Asn Pro Ile Gln Gln Ser Val Ile Asp Phe Asp Gly Lys Gly Thr Gly
165 170 175

CAC GAA GTC GAG ATC GCG ATG CAG TGG AAC GGT GGT TAT TCG GAG TCG 576
His Glu Val Glu Ile Ala Met Gln Trp Asn Gly Gly Tyr Ser Glu Ser
180 185 190

GTG CAC ACC TTC GCC AAC ACC ATC AAC ACC CAT GAG GGC GGC ACC CAC 624
Val His Thr Phe Ala Asn Thr Ile Asn Thr His Glu Gly Gly Thr His
195 200 205

GAG GAG GGC TTC CGC AGC GCG CTG ACC TCG GTG GTG AAC AAG TAC GCC 672
Glu Glu Gly Phe Arg Ser Ala Leu Thr Ser Val Val Asn Lys Tyr Ala
210 215 220

AAA GAC AAG AAG CTG CTC AAG GAC AAG GAT CCC AAC CTC ACC GGC GAC 720
Lys Asp Lys Lys Leu Leu Lys Asp Lys Asp Pro Asn Leu Thr Gly Asp
225 230 235 240

GAC ATC CGA GAA GGG CTG GCC GCG GTG ATC TCC GTG AAG GTC GCC GAG 768
Asp Ile Arg Glu Gly Leu Ala Ala Val Ile Ser Val Lys Val Ala Glu
245 250 255

CCG CAG TTC GAG GGC CAG ACT AAG ACG AAA CTC GGC AAC ACC GAG GTC 816
Pro Gln Phe Glu Gly Gln Thr Lys Thr Lys Leu Gly Asn Thr Glu Val
260 265 270

AAG TCG TTT GTC CAG AAA GTC TGT AAC GAA CAA CTC ACT CAC TGG TTC 864
Lys Ser Phe Val Gln Lys Val Cys Asn Glu Gln Leu Thr His Trp Phe
275 280 285

GAG GCG AAC CCG TCG GAA GCT AAA ACC GTT GTA AAC AAG GCG GTT TCG 912
Glu Ala Asn Pro Ser Glu Ala Lys Thr Val Val Asn Lys Ala Val Ser
290 295 300

TCG GCC CAG GCC CGC ATT GCG GCG CGT AAG GCG CGG GAG TTG GTG CGG 960
Ser Ala Gln Ala Arg Ile Ala Ala Arg Lys Ala Arg Glu Leu Val Arg
305 310 315 320

CGT AAG AGT GCT ACG GAT TTG GGT GGG TTG CCG GGC AAG TTG GCT GAT 1008
Arg Lys Ser Ala Thr Asp Leu Gly Gly Leu Pro Gly Lys Leu Ala Asp
325 330 335

TGC CGC TCG ACG GAT CCG CGG AAG TCT GAG CTG TAT GTG GTG GAA GGT 1056
Cys Arg Ser Thr Asp Pro Arg Lys Ser Glu Leu Tyr Val Val Glu Gly
340 345 350

GAT TCC GCG GGT GGG TCG GCG AAA AGT GGG CGT GAT TCG ATG TTC CAG 1104
Asp Ser Ala Gly Gly Ser Ala Lys Ser Gly Arg Asp Ser Met Phe Gln
355 360 365

GCG ATC TTG CCG CTG CGC GGC AAG ATC ATC AAC GTC GAA AAG GCC CGC 1152
Ala Ile Leu Pro Leu Arg Gly Lys Ile Ile Asn Val Glu Lys Ala Arg
370 375 380

ATC GAT CGG GTG CTG AAA AAC ACC GAA GTC CAG GCC ATC ATC ACC GCG 1200
Ile Asp Arg Val Leu Lys Asn Thr Glu Val Gln Ala Ile Ile Thr Ala
385 390 395 400

CTG GGC ACC GGC ATC CAC GAC GAA TTC GAC ATC ACC AAA CTG CGT TAC 1248
Leu Gly Thr Gly Ile His Asp Glu Phe Asp Ile Thr Lys Leu Arg Tyr
405 410 415

CAC AAG ATC GTG TTG 1263
His Lys Ile Val Leu
420



SEQ ID NO: 4 
SEQUENCE LENGTH: 421
SEQUENCE TYPE: amino acid
TOPOLOGY: unknown
MOLECULE TYPE: protein
ORIGINAL SOURCE

ORGANISM: Mycobacterium simiae
STRAIN: KPM 1403
SEQUENCE DESCRIPTION 
Gly Glu Asn Ser Gly Tyr Thr Val Ser Gly Gly Leu His Gly Val Gly
1 5 10 15

Val Ser Val Val Asn Ala Leu Ser Thr Arg Leu Glu Val Asn Val Lys
20 25 30

Arg Asp Gly Tyr Glu Trp Phe Gln Tyr Tyr Asp Arg Ala Val Pro Gly
35 40 45

Thr Leu Lys Gln Gly Glu Ala Thr Lys Lys Thr Gly Thr Thr Ile Arg
50 55 60

Phe Trp Ala Asp Pro Glu Ile Phe Glu Thr Thr Gln Tyr Asp Phe Glu
65 70 75 80

Thr Val Ala Arg Arg Leu Gln Glu Met Ala Phe Leu Asn Lys Gly Leu
85 90 95

Thr Ile Asn Leu Thr Asp Glu Arg Val Glu Gln Asp Glu Val Val Asp
100 105 110

Glu Val Val Ser Asp Thr Ala Glu Ala Pro Lys Ser Ala Glu Glu Gln
115 120 125

Ala Ala Glu Ser Ala Lys Pro His Lys Val Lys His Arg Thr Phe His
130 135 140

Tyr Pro Gly Gly Leu Val Asp Phe Val Lys His Ile Asn Arg Thr Lys
145 150 155 160

Asn Pro Ile Gln Gln Ser Val Ile Asp Phe Asp Gly Lys Gly Thr Gly
165 170 175

His Glu Val Glu Ile Ala Met Gln Trp Asn Gly Gly Tyr Ser Glu Ser
180 185 190

Val His Thr Phe Ala Asn Thr Ile Asn Thr His Glu Gly Gly Thr His
195 200 205

Glu Glu Gly Phe Arg Ser Ala Leu Thr Ser Val Val Asn Lys Tyr Ala
210 215 220

Lys Asp Lys Lys Leu Leu Lys Asp Lys Asp Pro Asn Leu Thr Gly Asp
225 230 235 240

Asp Ile Arg Glu Gly Leu Ala Ala Val Ile Ser Val Lys Val Ala Glu
245 250 255

Pro Gln Phe Glu Gly Gln Thr Lys Thr Lys Leu Gly Asn Thr Glu Val
260 265 270

Lys Ser Phe Val Gln Lys Val Cys Asn Glu Gln Leu Thr His Trp Phe
275 280 285

Glu Ala Asn Pro Ser Glu Ala Lys Thr Val Val Asn Lys Ala Val Ser
290 295 300

Ser Ala Gln Ala Arg Ile Ala Ala Arg Lys Ala Arg Glu Leu Val Arg
305 310 315 320

Arg Lys Ser Ala Thr Asp Leu Gly Gly Leu Pro Gly Lys Leu Ala Asp
325 330 335

Cys Arg Ser Thr Asp Pro Arg Lys Ser Glu Leu Tyr Val Val Glu Gly
340 345 350

Asp Ser Ala Gly Gly Ser Ala Lys Ser Gly Arg Asp Ser Met Phe Gln
355 360 365

Ala Ile Leu Pro Leu Arg Gly Lys Ile Ile Asn Val Glu Lys Ala Arg
370 375 380

Ile Asp Arg Val Leu Lys Asn Thr Glu Val Gln Ala Ile Ile Thr Ala
385 390 395 400

Leu Gly Thr Gly Ile His Asp Glu Phe Asp Ile Thr Lys Leu Arg Tyr
405 410 415

His Lys Ile Val Leu
420
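
Each nucleic acid entry in this listing is paired with a protein entry that is its conceptual translation, so the declared base count should be exactly three times the declared residue count. The following minimal Python sketch checks that relation for the SEQUENCE LENGTH values stated in this listing. The names `DECLARED_LENGTHS` and `inconsistent_pairs` are hypothetical helpers introduced only for illustration and are not part of the patent.

```python
# Declared lengths copied from the SEQUENCE LENGTH fields of this listing:
# (nucleic acid SEQ ID NO, length in bases, protein SEQ ID NO, length in residues).
# The tuple layout and the helper name are illustrative assumptions.
DECLARED_LENGTHS = [
    (3, 1263, 4, 421),
    (5, 660, 6, 220),
    (7, 537, 8, 179),
    (9, 783, 10, 261),
    (11, 195, 12, 65),
    (13, 1170, 14, 390),
    (15, 696, 16, 232),
]


def inconsistent_pairs(pairs):
    """Return the pairs whose base count is not three times the residue count."""
    return [p for p in pairs if p[1] != 3 * p[3]]


if __name__ == "__main__":
    # An empty list means every declared pair is consistent.
    print("inconsistent pairs:", inconsistent_pairs(DECLARED_LENGTHS))
```

A check of this kind is a quick way to spot transcription damage in a listing before any sequence comparison is attempted.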

SEQ ID NO: 5 
SEQUENCE LENGTH: 660 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
MOLECULE TYPE: genomic DNA 
ORIGINAL SOURCE 

ORGANISM: Chitinophaga pinensis

STRAIN: DSM 2588 
SEQUENCE DESCRIPTION 

GTA GCA GGC TTC CGC CGT GCG ATA ACC CGT ATC TTC AAG AGC TAT GGT 48



EP0935 003A2 



Val Ala Gly Phe Arg Arg Ala Ile Thr Arg Ile Phe Lys Ser Tyr Gly
1 5 10 15

GAT AAG AAC AAA ATG TTC GAA AAA ACC AAG ATC GAA GTA ACA GGT GAT 96
Asp Lys Asn Lys Met Phe Glu Lys Thr Lys Ile Glu Val Thr Gly Asp
20 25 30

GAC TTC CGT GAA GGT CTG AGC GCT ATC ATC AGC GTA AAA GTA CCT GAA 144
Asp Phe Arg Glu Gly Leu Ser Ala Ile Ile Ser Val Lys Val Pro Glu
35 40 45

CCA CAG TTC GAA GGC CAG ACC AAA ACC AAA CTC GGT AAC TCC GAT GTA 192
Pro Gln Phe Glu Gly Gln Thr Lys Thr Lys Leu Gly Asn Ser Asp Val
50 55 60

ATG GGG GTT GTG GAC AGT TCC GTA GCA GCC GTA CTG GAT GCC TAC CTG 240
Met Gly Val Val Asp Ser Ser Val Ala Ala Val Leu Asp Ala Tyr Leu
65 70 75 80

GAA GAA CAT CCC CGC GAA GCC AAG ATC ATT ATC AAT AAA GTG GTA CTG 288
Glu Glu His Pro Arg Glu Ala Lys Ile Ile Ile Asn Lys Val Val Leu
85 90 95

GCA GCA CAG GCG CGT GAA GCA GCC CGT AAA GCA CGC CAG ATG GTA CAG 336
Ala Ala Gln Ala Arg Glu Ala Ala Arg Lys Ala Arg Gln Met Val Gln
100 105 110

CGT AAG AGC GTA CTG AGT GGA AGC GGC TTG CCT GGT AAA CTG GCT GAC 384
Arg Lys Ser Val Leu Ser Gly Ser Gly Leu Pro Gly Lys Leu Ala Asp
115 120 125

TGC TCT GAA AAT GAT CCT GAA AAA TGT GAA CTG TAC CTG GTA GAG GGT 432
Cys Ser Glu Asn Asp Pro Glu Lys Cys Glu Leu Tyr Leu Val Glu Gly
130 135 140

GAC TCC GCA GGT GGT ACG GCT AAA CAA GGA CGT AAC CGT AGC TTC CAG 480
Asp Ser Ala Gly Gly Thr Ala Lys Gln Gly Arg Asn Arg Ser Phe Gln
145 150 155 160

GCG ATC CTG CCG CTC AGG GGT AAA ATC CTG AAC GTG GAG AAA GCC ATG 528
Ala Ile Leu Pro Leu Arg Gly Lys Ile Leu Asn Val Glu Lys Ala Met
165 170 175

GAG CAT AAG ATA TAT GAG AAT GAG GAG ATT AAA AAC ATC TTC ACC GCA 576
Glu His Lys Ile Tyr Glu Asn Glu Glu Ile Lys Asn Ile Phe Thr Ala
180 185 190

CTT GGT GTA ACC ATC GGT ACG GAA GAA GAT GAC AAA GCC CTC AAC CTC 624
Leu Gly Val Thr Ile Gly Thr Glu Glu Asp Asp Lys Ala Leu Asn Leu
195 200 205

TCC AAA CTG CGC TAT CAC AAA CTG ATC ATC ATG ACG 660
Ser Lys Leu Arg Tyr His Lys Leu Ile Ile Met Thr
210 215 220



SEQ ID NO: 6
SEQUENCE LENGTH: 220
SEQUENCE TYPE: amino acid
TOPOLOGY: unknown
MOLECULE TYPE: protein
ORIGINAL SOURCE

ORGANISM: Chitinophaga pinensis
STRAIN: DSM 2588
SEQUENCE DESCRIPTION 

Val Ala Gly Phe Arg Arg Ala Ile Thr Arg Ile Phe Lys Ser Tyr Gly
1 5 10 15

Asp Lys Asn Lys Met Phe Glu Lys Thr Lys Ile Glu Val Thr Gly Asp
20 25 30

Asp Phe Arg Glu Gly Leu Ser Ala Ile Ile Ser Val Lys Val Pro Glu
35 40 45

Pro Gln Phe Glu Gly Gln Thr Lys Thr Lys Leu Gly Asn Ser Asp Val
50 55 60

Met Gly Val Val Asp Ser Ser Val Ala Ala Val Leu Asp Ala Tyr Leu
65 70 75 80

Glu Glu His Pro Arg Glu Ala Lys Ile Ile Ile Asn Lys Val Val Leu
85 90 95

Ala Ala Gln Ala Arg Glu Ala Ala Arg Lys Ala Arg Gln Met Val Gln
100 105 110

Arg Lys Ser Val Leu Ser Gly Ser Gly Leu Pro Gly Lys Leu Ala Asp
115 120 125

Cys Ser Glu Asn Asp Pro Glu Lys Cys Glu Leu Tyr Leu Val Glu Gly
130 135 140

Asp Ser Ala Gly Gly Thr Ala Lys Gln Gly Arg Asn Arg Ser Phe Gln
145 150 155 160

Ala Ile Leu Pro Leu Arg Gly Lys Ile Leu Asn Val Glu Lys Ala Met
165 170 175

Glu His Lys Ile Tyr Glu Asn Glu Glu Ile Lys Asn Ile Phe Thr Ala
180 185 190

Leu Gly Val Thr Ile Gly Thr Glu Glu Asp Asp Lys Ala Leu Asn Leu
195 200 205

Ser Lys Leu Arg Tyr His Lys Leu Ile Ile Met Thr
210 215 220



SEQ ID NO: 7
SEQUENCE LENGTH: 537
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE

ORGANISM: Flavobacterium aquatile
STRAIN: IAM 12316
SEQUENCE DESCRIPTION

GAT AAA GAT TCT TAT AAA GTT TCG GGT GGA CTT CAC GGA GTT GGT GTT 48
Asp Lys Asp Ser Tyr Lys Val Ser Gly Gly Leu His Gly Val Gly Val
1 5 10 15

TCT TGC GTT AAT GCA CTT TCT GAT AAC CTA AAA GCA ACC GTT TTT AGA 96
Ser Cys Val Asn Ala Leu Ser Asp Asn Leu Lys Ala Thr Val Phe Arg
20 25 30

GAC GGA AAA GTG TAC GAG CAA GAA TAT GAA AAA GGT AAA GCA ATG TAT 144
Asp Gly Lys Val Tyr Glu Gln Glu Tyr Glu Lys Gly Lys Ala Met Tyr
35 40 45

CCG GTT AAG CAA GTT GGT GAA ACA ACA AAG CGA GGA ACA ATG GTT ACT 192
Pro Val Lys Gln Val Gly Glu Thr Thr Lys Arg Gly Thr Met Val Thr
50 55 60

TTT CAT CCT GAT AAA ACC ATT TTT ACT CAA ACA ATT GAG TAT TCT TAT 240
Phe His Pro Asp Lys Thr Ile Phe Thr Gln Thr Ile Glu Tyr Ser Tyr
65 70 75 80

GAT ACA CTT GCA GCA CGT ATG CGT GAA TTA TCT TTC CTG AAT AAA GGA 288
Asp Thr Leu Ala Ala Arg Met Arg Glu Leu Ser Phe Leu Asn Lys Gly
85 90 95

ATT ACA ATC ACA CTT ACA GAT AAA AGA CAT ACT AAA GAC AAC GGC GAT 336
Ile Thr Ile Thr Leu Thr Asp Lys Arg His Thr Lys Asp Asn Gly Asp
100 105 110

TTT GAA GGT GAA GTT TTT CAT TCT AAA GAA GGG CTT AAA GAA TTC GTT 384
Phe Glu Gly Glu Val Phe His Ser Lys Glu Gly Leu Lys Glu Phe Val
115 120 125

CGA TTT TTA GAT GCT GGT AGA GAA CCA ATT ATT TCT CAC GTA ATA AGC 432
Arg Phe Leu Asp Ala Gly Arg Glu Pro Ile Ile Ser His Val Ile Ser
130 135 140

ATG GAG CAC GAA AAA GGA GAA GTT CCT GTT GAG GTT GCT CTT GTT TAC 480
Met Glu His Glu Lys Gly Glu Val Pro Val Glu Val Ala Leu Val Tyr
145 150 155 160

AAT ACA AGT TAC TCC GAA AAT ATT TTC TCT TAC GTA AAT AAT ATT AAC 528
Asn Thr Ser Tyr Ser Glu Asn Ile Phe Ser Tyr Val Asn Asn Ile Asn
165 170 175

ACG CAC GAA 537
Thr His Glu



SEQ ID NO: 8
SEQUENCE LENGTH: 179
SEQUENCE TYPE: amino acid
TOPOLOGY: unknown
MOLECULE TYPE: protein
ORIGINAL SOURCE

ORGANISM: Flavobacterium aquatile
STRAIN: IAM 12316
SEQUENCE DESCRIPTION

Asp Lys Asp Ser Tyr Lys Val Ser Gly Gly Leu His Gly Val Gly Val
1 5 10 15

Ser Cys Val Asn Ala Leu Ser Asp Asn Leu Lys Ala Thr Val Phe Arg
20 25 30

Asp Gly Lys Val Tyr Glu Gln Glu Tyr Glu Lys Gly Lys Ala Met Tyr
35 40 45

Pro Val Lys Gln Val Gly Glu Thr Thr Lys Arg Gly Thr Met Val Thr
50 55 60

Phe His Pro Asp Lys Thr Ile Phe Thr Gln Thr Ile Glu Tyr Ser Tyr
65 70 75 80

Asp Thr Leu Ala Ala Arg Met Arg Glu Leu Ser Phe Leu Asn Lys Gly
85 90 95

Ile Thr Ile Thr Leu Thr Asp Lys Arg His Thr Lys Asp Asn Gly Asp
100 105 110

Phe Glu Gly Glu Val Phe His Ser Lys Glu Gly Leu Lys Glu Phe Val
115 120 125

Arg Phe Leu Asp Ala Gly Arg Glu Pro Ile Ile Ser His Val Ile Ser
130 135 140

Met Glu His Glu Lys Gly Glu Val Pro Val Glu Val Ala Leu Val Tyr
145 150 155 160

Asn Thr Ser Tyr Ser Glu Asn Ile Phe Ser Tyr Val Asn Asn Ile Asn
165 170 175

Thr His Glu



SEQ ID NO: 9
SEQUENCE LENGTH: 783
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE

ORGANISM: Mycobacterium asiaticum
STRAIN: ATCC 25274
SEQUENCE DESCRIPTION

GGC GAG AAC AGC GGC TAC ACC GTC AGC GGT GGG TTG CAC GGA GTG GGC 48
Gly Glu Asn Ser Gly Tyr Thr Val Ser Gly Gly Leu His Gly Val Gly
1 5 10 15

GTG TCG GTG GTC AAC GCG CTG TCC ACC CGC CTG GAG GTC ACC ATC AAG 96
Val Ser Val Val Asn Ala Leu Ser Thr Arg Leu Glu Val Thr Ile Lys
20 25 30

CGC GAC GGG CAC GAG TGG TTT CAG TAC TAC GAC CGC GCC GTG CCC GGA 144
Arg Asp Gly His Glu Trp Phe Gln Tyr Tyr Asp Arg Ala Val Pro Gly
35 40 45

ACC CTC AAG CAG GGC GAG GCC ACC AAG AAG ACC GGA ACC ACG ATC AGG 192
Thr Leu Lys Gln Gly Glu Ala Thr Lys Lys Thr Gly Thr Thr Ile Arg
50 55 60

TTC TGG GCG GAC CCC GAA ATC TTC GAA ACC ACA CAG TAC GAC TTC GAG 240
Phe Trp Ala Asp Pro Glu Ile Phe Glu Thr Thr Gln Tyr Asp Phe Glu
65 70 75 80

ACC GTG GCG CGG CGG CTG CAG GAG ATG GCC TTC CTC AAC AAG GGC CTC 288
Thr Val Ala Arg Arg Leu Gln Glu Met Ala Phe Leu Asn Lys Gly Leu
85 90 95

ACC ATC AAC CTC ACC GAC GAA CGA GTG GAG CAG GAC GAG GTC GTC GAC 336
Thr Ile Asn Leu Thr Asp Glu Arg Val Glu Gln Asp Glu Val Val Asp
100 105 110

GAG GTC GTC AGC GAC ACC GCC GAG GCA CCG AAG TCC GCC GAA GAG AAG 384
Glu Val Val Ser Asp Thr Ala Glu Ala Pro Lys Ser Ala Glu Glu Lys
115 120 125

GCC GCG GAA TCG ACT GCG CCA CAC AAG GTC AAG CAC CGC ACC TTC CAC 432
Ala Ala Glu Ser Thr Ala Pro His Lys Val Lys His Arg Thr Phe His
130 135 140

TAC CCC GGC GGT CTG GTC GAC TTC GTC AAG CAC ATC AAC CGC ACC AAG 480
Tyr Pro Gly Gly Leu Val Asp Phe Val Lys His Ile Asn Arg Thr Lys
145 150 155 160

AGC CCG ATC CAG CAG AGC GTC ATC GAT TTC GAC GGC AAG GGC ACC GGC 528
Ser Pro Ile Gln Gln Ser Val Ile Asp Phe Asp Gly Lys Gly Thr Gly
165 170 175

CAC GAG GTC GAG ATC GCC ATG CAG TGG AAC GGC GGC TAC TCG GAG TCC 576
His Glu Val Glu Ile Ala Met Gln Trp Asn Gly Gly Tyr Ser Glu Ser
180 185 190

GTC CAC ACC TTC GCC AAC ACC ATC AAC ACG CAC GAG GGC GGC ACC CAC 624
Val His Thr Phe Ala Asn Thr Ile Asn Thr His Glu Gly Gly Thr His
195 200 205

GAG GAG GGC TTC CGC AGC GCG CTG ACG TCG GTG GTG AAC AAG TAC GCC 672
Glu Glu Gly Phe Arg Ser Ala Leu Thr Ser Val Val Asn Lys Tyr Ala
210 215 220

AAA GAC AAG AAA CTG CTG AAG GAC AAA GAT CCC AAC CTC ACC GGT GAC 720
Lys Asp Lys Lys Leu Leu Lys Asp Lys Asp Pro Asn Leu Thr Gly Asp
225 230 235 240

GAC ATC CGT GAG GGC TTG GCC GCG GTC ATC TCG GTG AAG GTC GCC GAG 768
Asp Ile Arg Glu Gly Leu Ala Ala Val Ile Ser Val Lys Val Ala Glu
245 250 255

CCA CAG TTC GAA GGC 783
Pro Gln Phe Glu Gly
260

SEQ ID NO: 10
SEQUENCE LENGTH: 261
SEQUENCE TYPE: amino acid
TOPOLOGY: unknown
MOLECULE TYPE: protein
ORIGINAL SOURCE

ORGANISM: Mycobacterium asiaticum
STRAIN: ATCC 25274
SEQUENCE DESCRIPTION

Gly Glu Asn Ser Gly Tyr Thr Val Ser Gly Gly Leu His Gly Val Gly
1 5 10 15

Val Ser Val Val Asn Ala Leu Ser Thr Arg Leu Glu Val Thr Ile Lys
20 25 30

Arg Asp Gly His Glu Trp Phe Gln Tyr Tyr Asp Arg Ala Val Pro Gly
35 40 45

Thr Leu Lys Gln Gly Glu Ala Thr Lys Lys Thr Gly Thr Thr Ile Arg
50 55 60

Phe Trp Ala Asp Pro Glu Ile Phe Glu Thr Thr Gln Tyr Asp Phe Glu
65 70 75 80

Thr Val Ala Arg Arg Leu Gln Glu Met Ala Phe Leu Asn Lys Gly Leu
85 90 95

Thr Ile Asn Leu Thr Asp Glu Arg Val Glu Gln Asp Glu Val Val Asp
100 105 110

Glu Val Val Ser Asp Thr Ala Glu Ala Pro Lys Ser Ala Glu Glu Lys
115 120 125

Ala Ala Glu Ser Thr Ala Pro His Lys Val Lys His Arg Thr Phe His
130 135 140

Tyr Pro Gly Gly Leu Val Asp Phe Val Lys His Ile Asn Arg Thr Lys
145 150 155 160

Ser Pro Ile Gln Gln Ser Val Ile Asp Phe Asp Gly Lys Gly Thr Gly
165 170 175

His Glu Val Glu Ile Ala Met Gln Trp Asn Gly Gly Tyr Ser Glu Ser
180 185 190

Val His Thr Phe Ala Asn Thr Ile Asn Thr His Glu Gly Gly Thr His
195 200 205

Glu Glu Gly Phe Arg Ser Ala Leu Thr Ser Val Val Asn Lys Tyr Ala
210 215 220

Lys Asp Lys Lys Leu Leu Lys Asp Lys Asp Pro Asn Leu Thr Gly Asp
225 230 235 240

Asp Ile Arg Glu Gly Leu Ala Ala Val Ile Ser Val Lys Val Ala Glu
245 250 255

Pro Gln Phe Glu Gly
260
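
The two Mycobacterium entries (SEQ ID NO: 3 from M. simiae and SEQ ID NO: 9 from M. asiaticum) are similar enough to be compared position by position. The sketch below counts mismatches over bases 49 to 96 of both entries, copied from the second row of each listing. Plain positional identity is an assumption chosen for illustration: this section does not state how sequences are to be compared, and `count_mismatches` and `percent_identity` are hypothetical helper names.

```python
# Bases 49-96 (second row) of SEQ ID NO: 3 and SEQ ID NO: 9, copied from
# this listing with the codon spacing removed.
SIMIAE_49_96 = (
    "GTG TCG GTG GTC AAC GCC CTG TCC ACC CGC CTG GAA GTC AAC GTC AAG"
).replace(" ", "")
ASIATICUM_49_96 = (
    "GTG TCG GTG GTC AAC GCG CTG TCC ACC CGC CTG GAG GTC ACC ATC AAG"
).replace(" ", "")


def count_mismatches(seq_a: str, seq_b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (no alignment is done)")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two equal-length sequences."""
    return 100.0 * (len(seq_a) - count_mismatches(seq_a, seq_b)) / len(seq_a)


if __name__ == "__main__":
    print(count_mismatches(SIMIAE_49_96, ASIATICUM_49_96))
    print(round(percent_identity(SIMIAE_49_96, ASIATICUM_49_96), 1))
```

Over this 48-base window the two species differ at four bases, of which only two change the encoded amino acid (Asn/Thr at residue 30 and Val/Ile at residue 31). Sequences of different lengths would need a proper alignment step first, which this sketch deliberately leaves out.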

SEQ ID NO: 11
SEQUENCE LENGTH: 195
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE

ORGANISM: Cytophaga lytica
STRAIN: IFO 16020
SEQUENCE DESCRIPTION

AGC CAC ATT GAA ACT TTA ATT CTT ACA TTC TTC TTC CGT TTT ATG CGA 48
Ser His Ile Glu Thr Leu Ile Leu Thr Phe Phe Phe Arg Phe Met Arg
1 5 10 15

GAA CTA ATA GAA GGC GGA CAC GTT TAC ATA GCA ACA CCA CCT TTA TAT 96
Glu Leu Ile Glu Gly Gly His Val Tyr Ile Ala Thr Pro Pro Leu Tyr
20 25 30

TTA GTT AAA AAA GGA ACT AAA AAG CGT TAT GCT TGG AAT GAT AAA GAA 144
Leu Val Lys Lys Gly Thr Lys Lys Arg Tyr Ala Trp Asn Asp Lys Glu
35 40 45

CGA GAT GAA ATA GCA GAT AGC TTT AAT GGT AGT GTA GGT ATC CAA AGA 192
Arg Asp Glu Ile Ala Asp Ser Phe Asn Gly Ser Val Gly Ile Gln Arg
50 55 60

TAT 195
Tyr

SEQ ID NO: 12
SEQUENCE LENGTH: 65 
SEQUENCE TYPE: amino acid 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
ORIGINAL SOURCE 

ORGANISM: Cytophaga lytica 

STRAIN: IFO 16020 
SEQUENCE DESCRIPTION 

Ser His Ile Glu Thr Leu Ile Leu Thr Phe Phe Phe Arg Phe Met Arg
1 5 10 15

Glu Leu Ile Glu Gly Gly His Val Tyr Ile Ala Thr Pro Pro Leu Tyr
20 25 30

Leu Val Lys Lys Gly Thr Lys Lys Arg Tyr Ala Trp Asn Asp Lys Glu
35 40 45

Arg Asp Glu Ile Ala Asp Ser Phe Asn Gly Ser Val Gly Ile Gln Arg
50 55 60

Tyr
65
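
In the nucleic acid entries, the three-letter amino acid line printed under each codon row is the conceptual translation of that row. The sketch below reproduces this for the short Cytophaga lytica fragment (SEQ ID NO: 11) and yields the residues listed as SEQ ID NO: 12. It assumes the standard genetic code, whose codon assignments are shared by the bacterial code. `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical names introduced for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

# SEQ ID NO: 11 as printed in this listing (space-separated codons).
SEQ_ID_11 = (
    "AGC CAC ATT GAA ACT TTA ATT CTT ACA TTC TTC TTC CGT TTT ATG CGA "
    "GAA CTA ATA GAA GGC GGA CAC GTT TAC ATA GCA ACA CCA CCT TTA TAT "
    "TTA GTT AAA AAA GGA ACT AAA AAG CGT TAT GCT TGG AAT GAT AAA GAA "
    "CGA GAT GAA ATA GCA GAT AGC TTT AAT GGT AGT GTA GGT ATC CAA AGA "
    "TAT"
)


def translate(spaced_codons: str) -> list:
    """Translate space-separated codons into three-letter residue names."""
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in spaced_codons.split()]


if __name__ == "__main__":
    protein = translate(SEQ_ID_11)
    print(len(protein), " ".join(protein[:16]))
```

Running the same translation over a longer entry is a practical way to locate characters damaged in transcription, since a corrupted codon will either be missing from the table or disagree with the printed residue.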



SEQ ID NO: 13 
SEQUENCE LENGTH: 1170 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE

ORGANISM: Synechococcus sp.
STRAIN: PCC 6301
SEQUENCE DESCRIPTION

GTG GTG GAC AAC GCC GTC GAC AAA GCC TTG GCG GGC TAC TGC AAT ACC 48
Val Val Asp Asn Ala Val Asp Lys Ala Leu Ala Gly Tyr Cys Asn Thr
1 5 10 15

ATT GAT GTT CGT CTG CTC AAA GAC GGC TCC TGC CAA GTC ACC GAT AAC 96
Ile Asp Val Arg Leu Leu Lys Asp Gly Ser Cys Gln Val Thr Asp Asn
20 25 30

GGT CGC GGC ATT CCC ACA GAT ATT CAC CCC CAA ACC GGG AAG TCT GCT 144
Gly Arg Gly Ile Pro Thr Asp Ile His Pro Gln Thr Gly Lys Ser Ala
35 40 45

CTC GAA ACC GTG CTG ACG ATT CTG CAC GCG GGC GGC AAG TTT GGC GGT 192
Leu Glu Thr Val Leu Thr Ile Leu His Ala Gly Gly Lys Phe Gly Gly
50 55 60

GGC GGT TAT AAG GTG TCG GGG GGT CTG CAC GGC GTC GGT GTG TCT GTC 240
Gly Gly Tyr Lys Val Ser Gly Gly Leu His Gly Val Gly Val Ser Val
65 70 75 80

GTC AAC GCC CTC TCA GAA TAT GTC GAA GTC ACC GTG TGG CGG GAA GGC 288
Val Asn Ala Leu Ser Glu Tyr Val Glu Val Thr Val Trp Arg Glu Gly
85 90 95

AAA ACC CAC CAA CAG CGC TTT GAA CAG GGC AAC CCG ATC GGG GAG TTG 336
Lys Thr His Gln Gln Arg Phe Glu Gln Gly Asn Pro Ile Gly Glu Leu
100 105 110

CAA GTT GCC CCG GAT GCC GAC GAT CGC CGC GGG ACA CAA GTT CGT TTC 384
Gln Val Ala Pro Asp Ala Asp Asp Arg Arg Gly Thr Gln Val Arg Phe
115 120 125

AAA CCA GAC GCC ACG ATC TTT TCT GAA ACA ACC GAG TTC GAT TAC GGC 432
Lys Pro Asp Ala Thr Ile Phe Ser Glu Thr Thr Glu Phe Asp Tyr Gly
130 135 140

ACC CTA GCA AGC CGA TTG AAG GAG CTA GCC TAT CTG AAT GCG GGC GTC 480
Thr Leu Ala Ser Arg Leu Lys Glu Leu Ala Tyr Leu Asn Ala Gly Val
145 150 155 160

CGC ATC GAC TTT ACC GAT GAG CGG CTG CAG CTC ACC AAG AAT CAC GAG 528
Arg Ile Asp Phe Thr Asp Glu Arg Leu Gln Leu Thr Lys Asn His Glu
165 170 175

CCC CAT CAA GAA ACC TAT TAC TTT GAA GGC GGT ATT CGC GAA TAC GTC 576
Pro His Gln Glu Thr Tyr Tyr Phe Glu Gly Gly Ile Arg Glu Tyr Val
180 185 190

GCC TAC ATG AAT ACC GAT AAA CAG GCG CTG CAC TCA GAG ATT ATC TTT 624
Ala Tyr Met Asn Thr Asp Lys Gln Ala Leu His Ser Glu Ile Ile Phe
195 200 205

GTG CAA TCC GAA AAA GAT GGC GTC CAA GTT GAA GCT GCA TTG CAA TGG 672
Val Gln Ser Glu Lys Asp Gly Val Gln Val Glu Ala Ala Leu Gln Trp
210 215 220

TGC GTT GAC GCC TAC AGC GAC AAC ATT CTG GGC TTT GCC AAC AAC ATC 720
Cys Val Asp Ala Tyr Ser Asp Asn Ile Leu Gly Phe Ala Asn Asn Ile
225 230 235 240

CGC ACG ATT GAC GGC GGC ACC CAT ATT GAG GGG CTC AAA ACT GTT CTG 768
Arg Thr Ile Asp Gly Gly Thr His Ile Glu Gly Leu Lys Thr Val Leu
245 250 255

ACG CGG ACG ATG AAC ACG ATC GCC CGC AAA CGG AAT AAA CGC AAG GAT 816
Thr Arg Thr Met Asn Thr Ile Ala Arg Lys Arg Asn Lys Arg Lys Asp
260 265 270

GCC GAC AAT AAC CTG TCG GGC GAG AAT ATT CGC GAA GGG TTA ACA GCG 864
Ala Asp Asn Asn Leu Ser Gly Glu Asn Ile Arg Glu Gly Leu Thr Ala
275 280 285

ATC GTT TCG GTC AAA GTT CCG GAT CCG GAA TTT GAA GGG CAA ACC AAA 912
Ile Val Ser Val Lys Val Pro Asp Pro Glu Phe Glu Gly Gln Thr Lys
290 295 300

ACA AAG CTC GGC AAT ACC GAA GTT CGC GGC ATC GTC GAT ACG CTC GTG 960
Thr Lys Leu Gly Asn Thr Glu Val Arg Gly Ile Val Asp Thr Leu Val
305 310 315 320

GGC GAA ACG TTG ACG GAA TAT CTG GAA TTC CAT CCC AGC GTT GCC GAT 1008
Gly Glu Thr Leu Thr Glu Tyr Leu Glu Phe His Pro Ser Val Ala Asp
325 330 335

TTG ATC CTC GAA AAA GCG ATT CAA GCC TTT AAT GCG GCT GAG GCA GCG 1056
Leu Ile Leu Glu Lys Ala Ile Gln Ala Phe Asn Ala Ala Glu Ala Ala
340 345 350

CGA CGG GCA CGG GAA TTG GTG CGT CGC AAA TCA GTG CTG GAA TCT TCG 1104
Arg Arg Ala Arg Glu Leu Val Arg Arg Lys Ser Val Leu Glu Ser Ser
355 360 365

ACA TTG CCC GGT AAA TTA GCA GAC TGT TCC AGT CGC GAT CCC GGT GAA 1152
Thr Leu Pro Gly Lys Leu Ala Asp Cys Ser Ser Arg Asp Pro Gly Glu
370 375 380

TCT GAA ATC TTC ATC GTG 1170
Ser Glu Ile Phe Ile Val
385 390

SEQ ID NO: 14
SEQUENCE LENGTH: 390
SEQUENCE TYPE: amino acid
TOPOLOGY: unknown
MOLECULE TYPE: protein
ORIGINAL SOURCE

ORGANISM: Synechococcus sp.
STRAIN: PCC 6301
SEQUENCE DESCRIPTION

Val Val Asp Asn Ala Val Asp Lys Ala Leu Ala Gly Tyr Cys Asn Thr
1 5 10 15

Ile Asp Val Arg Leu Leu Lys Asp Gly Ser Cys Gln Val Thr Asp Asn
20 25 30

Gly Arg Gly Ile Pro Thr Asp Ile His Pro Gln Thr Gly Lys Ser Ala
35 40 45

Leu Glu Thr Val Leu Thr Ile Leu His Ala Gly Gly Lys Phe Gly Gly
50 55 60

Gly Gly Tyr Lys Val Ser Gly Gly Leu His Gly Val Gly Val Ser Val
65 70 75 80

Val Asn Ala Leu Ser Glu Tyr Val Glu Val Thr Val Trp Arg Glu Gly
85 90 95

Lys Thr His Gln Gln Arg Phe Glu Gln Gly Asn Pro Ile Gly Glu Leu
100 105 110

Gln Val Ala Pro Asp Ala Asp Asp Arg Arg Gly Thr Gln Val Arg Phe
115 120 125

Lys Pro Asp Ala Thr Ile Phe Ser Glu Thr Thr Glu Phe Asp Tyr Gly
130 135 140

Thr Leu Ala Ser Arg Leu Lys Glu Leu Ala Tyr Leu Asn Ala Gly Val
145 150 155 160

Arg Ile Asp Phe Thr Asp Glu Arg Leu Gln Leu Thr Lys Asn His Glu
165 170 175

Pro His Gln Glu Thr Tyr Tyr Phe Glu Gly Gly Ile Arg Glu Tyr Val
180 185 190

Ala Tyr Met Asn Thr Asp Lys Gln Ala Leu His Ser Glu Ile Ile Phe
195 200 205

Val Gln Ser Glu Lys Asp Gly Val Gln Val Glu Ala Ala Leu Gln Trp
210 215 220

Cys Val Asp Ala Tyr Ser Asp Asn Ile Leu Gly Phe Ala Asn Asn Ile
225 230 235 240

Arg Thr Ile Asp Gly Gly Thr His Ile Glu Gly Leu Lys Thr Val Leu
245 250 255

Thr Arg Thr Met Asn Thr Ile Ala Arg Lys Arg Asn Lys Arg Lys Asp
260 265 270

Ala Asp Asn Asn Leu Ser Gly Glu Asn Ile Arg Glu Gly Leu Thr Ala
275 280 285

Ile Val Ser Val Lys Val Pro Asp Pro Glu Phe Glu Gly Gln Thr Lys
290 295 300

Thr Lys Leu Gly Asn Thr Glu Val Arg Gly Ile Val Asp Thr Leu Val
305 310 315 320

Gly Glu Thr Leu Thr Glu Tyr Leu Glu Phe His Pro Ser Val Ala Asp
325 330 335

Leu Ile Leu Glu Lys Ala Ile Gln Ala Phe Asn Ala Ala Glu Ala Ala
340 345 350

Arg Arg Ala Arg Glu Leu Val Arg Arg Lys Ser Val Leu Glu Ser Ser
355 360 365

Thr Leu Pro Gly Lys Leu Ala Asp Cys Ser Ser Arg Asp Pro Gly Glu
370 375 380

Ser Glu Ile Phe Ile Val
385 390

SEQ ID NO: 15
SEQUENCE LENGTH: 696
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE 

ORGANISM: Caulobacter crescentus 
STRAIN: ATCC 15252 
SEQUENCE DESCRIPTION 

CAG AAC AGC TAC AAG GTC TCG GGC GGT CTG CAC GGC GTG GGC GTC TCG 48
Gln Asn Ser Tyr Lys Val Ser Gly Gly Leu His Gly Val Gly Val Ser
1 5 10 15

GTC GTG AAC GCC CTG TCG GAT TGG CTG GAG CTG CTG ATC CAC CGC AAC 96
Val Val Asn Ala Leu Ser Asp Trp Leu Glu Leu Leu Ile His Arg Asn
20 25 30

GGC AAG GTC CAC CAG ATG CGC TTC GAG CGC GGC GAC GCG GTC ACC TCG 144
Gly Lys Val His Gln Met Arg Phe Glu Arg Gly Asp Ala Val Thr Ser
35 40 45

CTG AAG GTC ACC GGC GAC TCG CCC GTG CGG ACC GAG GGC CCC AAG GCC 192
Leu Lys Val Thr Gly Asp Ser Pro Val Arg Thr Glu Gly Pro Lys Ala
50 55 60

GGC GAG ACC CTG ACC GGT ACG GAA GTT ACG TTC TTT CCG TCG AAG GAC 240
Gly Glu Thr Leu Thr Gly Thr Glu Val Thr Phe Phe Pro Ser Lys Asp
65 70 75 80

ACC TTC GCC TTC ATC GAA TTC GAC CGG AAG ACG CTG GAG CAC CGC CTG 288
Thr Phe Ala Phe Ile Glu Phe Asp Arg Lys Thr Leu Glu His Arg Leu
85 90 95

CGC GAG CTG GCC TTC CTG AAC TCG GGC GTG ACG ATC TGG TTC AAG GAC 336
Arg Glu Leu Ala Phe Leu Asn Ser Gly Val Thr Ile Trp Phe Lys Asp
100 105 110

CAT CGC GAC GTC GAG CCG TGG GAA GAG AAG CTG TTC TAC GAG GGC GGC 384
His Arg Asp Val Glu Pro Trp Glu Glu Lys Leu Phe Tyr Glu Gly Gly
115 120 125

ATC GAG GCC TTC GTG CGC CAC CTC GAC AAG GCC AAG ACG CCG CTG CTG 432
Ile Glu Ala Phe Val Arg His Leu Asp Lys Ala Lys Thr Pro Leu Leu
130 135 140

AAG GCC CCG ATC GCC GTC AAG GGC GTC AAG GAC AAG GTC GAG ATC GAC 480
Lys Ala Pro Ile Ala Val Lys Gly Val Lys Asp Lys Val Glu Ile Asp
145 150 155 160

CTG GCC CTG TGG TGG AAC GAC AGC TAC CAC GAG CAG ATG CTG TGC TTC 528
Leu Ala Leu Trp Trp Asn Asp Ser Tyr His Glu Gln Met Leu Cys Phe
165 170 175

ACC AAC AAC ATC CCG CAG CGG GAT GGC GGC ACG CAC CTG TCG GCC TTT 576
Thr Asn Asn Ile Pro Gln Arg Asp Gly Gly Thr His Leu Ser Ala Phe
180 185 190

CGC GCG GCC CTG ACC CGG ATC ATC ACC AGC TAC GCC GAG AGC TCC GGC 624
Arg Ala Ala Leu Thr Arg Ile Ile Thr Ser Tyr Ala Glu Ser Ser Gly
195 200 205

ATC CTG AAG AAG GAA AAG GTC AGC CTG GGC GGC GAA GAC AGC CGC GAG 672
Ile Leu Lys Lys Glu Lys Val Ser Leu Gly Gly Glu Asp Ser Arg Glu
210 215 220

GGC CTG ACC TGC GTG CTG TCG GTC 696
Gly Leu Thr Cys Val Leu Ser Val
225 230

SEQ ID NO: 16
SEQUENCE LENGTH: 232
SEQUENCE TYPE: amino acid
TOPOLOGY: unknown
MOLECULE TYPE: protein
ORIGINAL SOURCE

ORGANISM: Caulobacter crescentus
STRAIN: ATCC 15252
SEQUENCE DESCRIPTION

Gln Asn Ser Tyr Lys Val Ser Gly Gly Leu His Gly Val Gly Val Ser
1 5 10 15

Val Val Asn Ala Leu Ser Asp Trp Leu Glu Leu Leu Ile His Arg Asn
20 25 30

Gly Lys Val His Gln Met Arg Phe Glu Arg Gly Asp Ala Val Thr Ser
35 40 45

Leu Lys Val Thr Gly Asp Ser Pro Val Arg Thr Glu Gly Pro Lys Ala
50 55 60

Gly Glu Thr Leu Thr Gly Thr Glu Val Thr Phe Phe Pro Ser Lys Asp
65 70 75 80

Thr Phe Ala Phe Ile Glu Phe Asp Arg Lys Thr Leu Glu His Arg Leu
85 90 95

Arg Glu Leu Ala Phe Leu Asn Ser Gly Val Thr Ile Trp Phe Lys Asp
100 105 110

His Arg Asp Val Glu Pro Trp Glu Glu Lys Leu Phe Tyr Glu Gly Gly
115 120 125

Ile Glu Ala Phe Val Arg His Leu Asp Lys Ala Lys Thr Pro Leu Leu
130 135 140

Lys Ala Pro Ile Ala Val Lys Gly Val Lys Asp Lys Val Glu Ile Asp
145 150 155 160

Leu Ala Leu Trp Trp Asn Asp Ser Tyr His Glu Gln Met Leu Cys Phe
165 170 175

Thr Asn Asn Ile Pro Gln Arg Asp Gly Gly Thr His Leu Ser Ala Phe
180 185 190

Arg Ala Ala Leu Thr Arg Ile Ile Thr Ser Tyr Ala Glu Ser Ser Gly
195 200 205

Ile Leu Lys Lys Glu Lys Val Ser Leu Gly Gly Glu Asp Ser Arg Glu
210 215 220

Gly Leu Thr Cys Val Leu Ser Val
225 230

SEQ ID NO: 17
SEQUENCE LENGTH: 888
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE

ORGANISM: Pseudomonas putida

STRAIN: ATCC 17484 
SEQUENCE DESCRIPTION 

GGC GGC CTG CAC GGT GTA GGC GTG TCG GTA GTG AAC GCA CTG TCT GAA 48
Gly Gly Leu His Gly Val Gly Val Ser Val Val Asn Ala Leu Ser Glu
1 5 10 15

GAG CTC GTC CTC ACC GTT CGC CGT AGC GGC AAG ATC TGG GAA CAG ACC 96
Glu Leu Val Leu Thr Val Arg Arg Ser Gly Lys Ile Trp Glu Gln Thr

TAC GTC CAT GGT GTT CCG CAG GAA CCG ATG AAG ATC GTT GGC GAC AGC 144
Tyr Val His Gly Val Pro Gln Glu Pro Met Lys Ile Val Gly Asp Ser

GAA ACC ACC GGC ACC CAG ATC CAC TTC AAG GCT TCC AGC GAA ACC TTC 192
Glu Thr Thr Gly Thr Gln Ile His Phe Lys Ala Ser Ser Glu Thr Phe

AAG AAC ATC CAC TTC AGC TGG GAC ATC CTG GCC AAG CGG ATT CGT GAA 240
Lys Asn Ile His Phe Ser Trp Asp Ile Leu Ala Lys Arg Ile Arg Glu

CTG TCC

Leu Ser 

AGC GGC 
Ser Gly 

GTT GAA 
Val Glu 

TTC AAC 
Phe Asn 
130 
TGG AAC 
Trp Asn 
145 

CCG CAG 
Pro Gin 

ACG CGT 
Thr Arg 

CAC AAG 
His Lys 

ATC ATT 
He He 
210 
GAC AAG 
Asp Lys 



TTC CTC 
Phe Leu 

AAG GAA 
Lys Glu 
100 
TAC CTG 
Tyr Leu 
115 

ATC CAG 
He Gin 

GAC AGC 
Asp Ser 

CGC GAT 
Arg Asp 

AAC CTC 
Asn Leu 
180 
GTC GCG 
Val Ala 
195 

TCG GTA 
Ser Val 

CTG GTT 
Leu Val 



AAC TCC GGT GTC 
Asn Ser Gly Val 
65 

GAA CTG TTC AAG 
Glu Leu Phe Lys 



AAC ACC 
Asn Thr 

CGC GAA 
Arg Glu 

TTC AAC 
Phe Asn 
150 
GGC GGT 
Gly Gly 
165 

AAT ACG 
Asn Thr 



AAC AAG 
Asn Lys 
120 
GAC GGC 
Asp Gly 
135 

GAG AAC 
Glu Asn 

ACT CAC 
Thr His 

TAT ATC 
Tyr He 



ACC ACC GGT GAC 
Thr Thr Gly Asp 
200 

AAA GTG CCG GAT 
Lys Val Pro Asp 
215 

TCT TCC GAA GTG 
Ser Ser Glu Val 



GGC ATC GTC CTC AAG GAT GAG CGC 288 
Gly He Val Leu Lys Asp Glu Arg 

90 95 
TAC GAA GGC GGC TTG CGC GCG TTC 336 
Tyr Glu Gly Gly Leu Arg Ala Phe 
105 110 
ACC CCG GTC AAC CAG GTG TTC CAT 384 
Thr Pro Val Asn Gin Val Phe His 
125 

ATC GGC GTA GAA ATC GCC CTG CAG 432 
He Gly Val Glu He Ala Leu Gin 
140 

CTG TTG TGC TTC ACC AAC AAC ATT 480 
Leu Leu Cys Phe Thr Asn Asn He 
155 160 
CTG GTG GGT TTC CGT TCC GCC CTG 528 
Leu Val Gly Phe Arg Ser Ala Leu 

170 175 
GAA GCC GAA GGC CTG GCG AAG AAG 576 
Glu Ala Glu Gly Leu Ala Lys Lys 
185 190 
GAT GCC CGT GAA GGC CTG GCC GCG 624 
Asp Ala Arg Glu Gly Leu Ala Ala 
205 

CCG AAG TTC AGC TCC CAG ACC AAG 672 
Pro Lys Phe Ser Ser Gin Thr Lys 
220 

AAG ACC GCG GTC GAA CAG GAA ATG 720 
Lys Thr Ala Val Glu Gin Glu Met 
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225 230 235 240 

GGC AAG TAC TTC TCC GAC TTC CTG CTG GAA AAC CCG AAC GAA GCC AAG 768
Gly Lys Tyr Phe Ser Asp Phe Leu Leu Glu Asn Pro Asn Glu Ala Lys
245 250 255

CTG GTT GTC GGC AAG ATG ATC GAC GCG GCA CGT GCT CGT GAA GCG GCG 816
Leu Val Val Gly Lys Met Ile Asp Ala Ala Arg Ala Arg Glu Ala Ala
260 265 270

CGC AAG ACC CGT GAG ATG ACC CGC CGC AAA GGC GCG CTG GAC ATC GCC 864
Arg Lys Thr Arg Glu Met Thr Arg Arg Lys Gly Ala Leu Asp Ile Ala
275 280 285

GGC CTG CCG GGC AAA CTG GCT GAC 888
Gly Leu Pro Gly Lys Leu Ala Asp
290 295
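The paired codon and residue rows of a listing such as SEQ ID NO: 17 can be cross-checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the helper names (`translate_row`, `mismatches`) are hypothetical and are not part of the specification. An unreadable codon is reported as `Xaa` so that it surfaces as a mismatch instead of being silently accepted.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_row(codon_row):
    """Translate a space-separated codon row into three-letter residues."""
    return [
        THREE_LETTER.get(CODON_TABLE.get(codon, "?"), "Xaa")
        for codon in codon_row.split()
    ]


def mismatches(codon_row, residue_row):
    """Return (position, codon, translated, listed) for each disagreement."""
    codons = codon_row.split()
    listed = residue_row.split()
    translated = translate_row(codon_row)
    return [
        (pos, codon, got, want)
        for pos, (codon, got, want) in enumerate(zip(codons, translated, listed), 1)
        if got != want
    ]


# First row of SEQ ID NO: 17 as printed in the listing.
row = "GGC GGC CTG CAC GGT GTA GGC GTG TCG GTA GTG AAC GCA CTG TCT GAA"
residues = "Gly Gly Leu His Gly Val Gly Val Ser Val Val Asn Ala Leu Ser Glu"
print(mismatches(row, residues))
```

An empty result means every codon agrees with the residue printed beneath it; a non-empty result points to the position that needs to be checked against the original document.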

SEQ ID NO: 18
SEQUENCE LENGTH: 296
SEQUENCE TYPE: amino acid
TOPOLOGY: unknown
MOLECULE TYPE: protein
ORIGINAL SOURCE

ORGANISM: Pseudomonas putida
STRAIN: ATCC 17484
SEQUENCE DESCRIPTION

Gly Gly Leu His Gly Val Gly Val Ser Val Val Asn Ala Leu Ser Glu
1 5 10 15

Glu Leu Val Leu Thr Val Arg Arg Ser Gly Lys Ile Trp Glu Gln Thr
20 25 30

Tyr Val His Gly Val Pro Gln Glu Pro Met Lys Ile Val Gly Asp Ser
35 40 45

Glu Thr Thr Gly Thr Gln Ile His Phe Lys Ala Ser Ser Glu Thr Phe
50 55 60

Lys Asn Ile His Phe Ser Trp Asp Ile Leu Ala Lys Arg Ile Arg Glu
65 70 75 80

Leu Ser Phe Leu Asn Ser Gly Val Gly Ile Val Leu Lys Asp Glu Arg
85 90 95

Ser Gly Lys Glu Glu Leu Phe Lys Tyr Glu Gly Gly Leu Arg Ala Phe
100 105 110

Val Glu Tyr Leu Asn Thr Asn Lys Thr Pro Val Asn Gln Val Phe His
115 120 125

Phe Asn Ile Gln Arg Glu Asp Gly Ile Gly Val Glu Ile Ala Leu Gln
130 135 140

Trp Asn Asp Ser Phe Asn Glu Asn Leu Leu Cys Phe Thr Asn Asn Ile
145 150 155 160

Pro Gln Arg Asp Gly Gly Thr His Leu Val Gly Phe Arg Ser Ala Leu
165 170 175

Thr Arg Asn Leu Asn Thr Tyr Ile Glu Ala Glu Gly Leu Ala Lys Lys
180 185 190

His Lys Val Ala Thr Thr Gly Asp Asp Ala Arg Glu Gly Leu Ala Ala
195 200 205

Ile Ile Ser Val Lys Val Pro Asp Pro Lys Phe Ser Ser Gln Thr Lys
210 215 220

Asp Lys Leu Val Ser Ser Glu Val Lys Thr Ala Val Glu Gln Glu Met
225 230 235 240

Gly Lys Tyr Phe Ser Asp Phe Leu Leu Glu Asn Pro Asn Glu Ala Lys
245 250 255

Leu Val Val Gly Lys Met Ile Asp Ala Ala Arg Ala Arg Glu Ala Ala
260 265 270

Arg Lys Thr Arg Glu Met Thr Arg Arg Lys Gly Ala Leu Asp Ile Ala
275 280 285

Gly Leu Pro Gly Lys Leu Ala Asp
290 295



SEQ ID NO: 19
SEQUENCE LENGTH: 531
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE

ORGANISM: Synechococcus sp.
STRAIN: PCC 6301
SEQUENCE DESCRIPTION

TTG GTG CGT CGC AAA TCA GTG CTG GAA TCT TCG ACA TTG CCC GGT AAA 48
Leu Val Arg Arg Lys Ser Val Leu Glu Ser Ser Thr Leu Pro Gly Lys
1 5 10 15

TTA GCA GAC TGT TCC AGT CGC GAT CCC GGT GAA TCT GAA ATC TTC ATC 96
Leu Ala Asp Cys Ser Ser Arg Asp Pro Gly Glu Ser Glu Ile Phe Ile
20 25 30

GTG GAA GGG GAT TCG GCA GGT GGC AGT GCT AAA CAG GGG CGC GAT CGC 144
Val Glu Gly Asp Ser Ala Gly Gly Ser Ala Lys Gln Gly Arg Asp Arg
35 40 45

CGC TTC CAA GCC ATC CTG CCT CTG CGC GGC AAA ATC CTC AAC ATC GAG 192
Arg Phe Gln Ala Ile Leu Pro Leu Arg Gly Lys Ile Leu Asn Ile Glu
50 55 60

AAA ACG GAC GAT GCC AAA ATC TAC AAA AAC ACT GAG ATC CAA GCC CTG 240
Lys Thr Asp Asp Ala Lys Ile Tyr Lys Asn Thr Glu Ile Gln Ala Leu
65 70 75 80

ATT ACA GCG CTG GGC CTC GGA ATT AAA GGG GAG GAA TTT GAT GCT TCC 288
Ile Thr Ala Leu Gly Leu Gly Ile Lys Gly Glu Glu Phe Asp Ala Ser
85 90 95

CAA CTG CGC TAC CAC CGT ATT GTG ATC ATG ACT GAC GCG GAC GTC GAT 336
Gln Leu Arg Tyr His Arg Ile Val Ile Met Thr Asp Ala Asp Val Asp
100 105 110

GGT GCG CAC ATC CGT ACC CTC TTG CTC ACC TTC TTC TAT CGC TAT CAG 384
Gly Ala His Ile Arg Thr Leu Leu Leu Thr Phe Phe Tyr Arg Tyr Gln
115 120 125

CGA TCG CTG CTG GAG CAG GGC TAC ATG TAC ATT GCC TGC CCG CCG CTG 432
Arg Ser Leu Leu Glu Gln Gly Tyr Met Tyr Ile Ala Cys Pro Pro Leu
130 135 140

TAC AAG TTG GAG CGG GGA CGT AAT CAC TAC TAT TGC TAC AAC GAA CGC 480
Tyr Lys Leu Glu Arg Gly Arg Asn His Tyr Tyr Cys Tyr Asn Glu Arg
145 150 155 160

GAA CTG CAG GAA CGG ATT GCG ACG TTC CCT GAA AAC GCC AAC TAT ACG 528
Glu Leu Gln Glu Arg Ile Ala Thr Phe Pro Glu Asn Ala Asn Tyr Thr
165 170 175

ATT 531
Ile



SEQ ID NO: 20
SEQUENCE LENGTH: 177
SEQUENCE TYPE: amino acid
TOPOLOGY: unknown
MOLECULE TYPE: protein
ORIGINAL SOURCE

ORGANISM: Synechococcus sp.
STRAIN: PCC 6301
SEQUENCE DESCRIPTION

Leu Val Arg Arg Lys Ser Val Leu Glu Ser Ser Thr Leu Pro Gly Lys
1 5 10 15

Leu Ala Asp Cys Ser Ser Arg Asp Pro Gly Glu Ser Glu Ile Phe Ile
20 25 30

Val Glu Gly Asp Ser Ala Gly Gly Ser Ala Lys Gln Gly Arg Asp Arg
35 40 45

Arg Phe Gln Ala Ile Leu Pro Leu Arg Gly Lys Ile Leu Asn Ile Glu
50 55 60

Lys Thr Asp Asp Ala Lys Ile Tyr Lys Asn Thr Glu Ile Gln Ala Leu
65 70 75 80

Ile Thr Ala Leu Gly Leu Gly Ile Lys Gly Glu Glu Phe Asp Ala Ser
85 90 95

Gln Leu Arg Tyr His Arg Ile Val Ile Met Thr Asp Ala Asp Val Asp
100 105 110

Gly Ala His Ile Arg Thr Leu Leu Leu Thr Phe Phe Tyr Arg Tyr Gln
115 120 125

Arg Ser Leu Leu Glu Gln Gly Tyr Met Tyr Ile Ala Cys Pro Pro Leu
130 135 140

Tyr Lys Leu Glu Arg Gly Arg Asn His Tyr Tyr Cys Tyr Asn Glu Arg
145 150 155 160

Glu Leu Gln Glu Arg Ile Ala Thr Phe Pro Glu Asn Ala Asn Tyr Thr
165 170 175

Ile

SEQ ID NO: 21
SEQUENCE LENGTH: 660 
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE

ORGANISM: Caulobacter crescentus
STRAIN: ATCC 15252
SEQUENCE DESCRIPTION

CGG GAT GGC GGC ACG CAC CTG TCG GCC TTT CGC GCG GCC CTG ACC CGG 48
Arg Asp Gly Gly Thr His Leu Ser Ala Phe Arg Ala Ala Leu Thr Arg
1 5 10 15

ATC ATC ACC AGC TAC GCC GAG AGC TCC GGC ATC CTG AAG AAG GAA AAG 96
Ile Ile Thr Ser Tyr Ala Glu Ser Ser Gly Ile Leu Lys Lys Glu Lys
20 25 30

GTC AGC CTG GGC GGC GAA GAC AGC CGC GAG GGC CTG ACC TGC GTG CTG 144
Val Ser Leu Gly Gly Glu Asp Ser Arg Glu Gly Leu Thr Cys Val Leu
35 40 45

TCG GTC AAG GTC CCG GAT CCG AAG TTC AGC TCG CAG ACC AAG GAC AAG 192
Ser Val Lys Val Pro Asp Pro Lys Phe Ser Ser Gln Thr Lys Asp Lys
50 55 60

CTG GTC TCG TCC GAA GTG CGC CCC GCC GTT GAG GGC CTG GTG TCG GAA 240
Leu Val Ser Ser Glu Val Arg Pro Ala Val Glu Gly Leu Val Ser Glu
65 70 75 80

GGT CTC TCG ACC TGG TTC GAG GAA CAT CCG AAC GAG GCC AAG GCG ATC 288
Gly Leu Ser Thr Trp Phe Glu Glu His Pro Asn Glu Ala Lys Ala Ile
85 90 95

GTG ACC AAG ATC GCC GAG GCC GCC GCC GCC CGC GAG GCC GCC CGC AAG 336
Val Thr Lys Ile Ala Glu Ala Ala Ala Ala Arg Glu Ala Ala Arg Lys
100 105 110

GCG CGA GAG CTG ACC CGC CGC AAG AGC GCG CTC GAC ATC ACC AGC CTG 384
Ala Arg Glu Leu Thr Arg Arg Lys Ser Ala Leu Asp Ile Thr Ser Leu
115 120 125

CCC GGC AAG CTC GCC GAC TGC TCG GAA CGC GAT CCG GCC AAG TCC GAG 432
Pro Gly Lys Leu Ala Asp Cys Ser Glu Arg Asp Pro Ala Lys Ser Glu
130 135 140

ATC TTC ATC GTC GAG GGC GAC TCG GCG GGC GGC TCG GCC AAG CAG GCC 480
Ile Phe Ile Val Glu Gly Asp Ser Ala Gly Gly Ser Ala Lys Gln Ala
145 150 155 160

CGC AAC CGC GAC AAC CAG GCC GTT CTG CCC CTG CGC GGC AAG ATC CTG 528
Arg Asn Arg Asp Asn Gln Ala Val Leu Pro Leu Arg Gly Lys Ile Leu
165 170 175

AAC GTC GAG CGG GCC CGC TTC GAC AAG ATG CTG TCG TCC GAC CAG ATC 576
Asn Val Glu Arg Ala Arg Phe Asp Lys Met Leu Ser Ser Asp Gln Ile
180 185 190

GGC ACG CTG ATC ACC GCC CTG GGC GCG GGG ATC GGC CGC GAC GAC TTC 624
Gly Thr Leu Ile Thr Ala Leu Gly Ala Gly Ile Gly Arg Asp Asp Phe
195 200 205

AAC CCG GAC AAG GTG CGC TAC CAC AAG ATC GTG CTG 660
Asn Pro Asp Lys Val Arg Tyr His Lys Ile Val Leu
210 215 220



SEQ ID NO: 22
SEQUENCE LENGTH: 220
SEQUENCE TYPE: amino acid
TOPOLOGY: unknown
MOLECULE TYPE: protein
ORIGINAL SOURCE

ORGANISM: Caulobacter crescentus
STRAIN: ATCC 15252
SEQUENCE DESCRIPTION

Arg Asp Gly Gly Thr His Leu Ser Ala Phe Arg Ala Ala Leu Thr Arg
1 5 10 15

Ile Ile Thr Ser Tyr Ala Glu Ser Ser Gly Ile Leu Lys Lys Glu Lys
20 25 30

Val Ser Leu Gly Gly Glu Asp Ser Arg Glu Gly Leu Thr Cys Val Leu
35 40 45

Ser Val Lys Val Pro Asp Pro Lys Phe Ser Ser Gln Thr Lys Asp Lys
50 55 60

Leu Val Ser Ser Glu Val Arg Pro Ala Val Glu Gly Leu Val Ser Glu
65 70 75 80

Gly Leu Ser Thr Trp Phe Glu Glu His Pro Asn Glu Ala Lys Ala Ile
85 90 95

Val Thr Lys Ile Ala Glu Ala Ala Ala Ala Arg Glu Ala Ala Arg Lys
100 105 110

Ala Arg Glu Leu Thr Arg Arg Lys Ser Ala Leu Asp Ile Thr Ser Leu
115 120 125

Pro Gly Lys Leu Ala Asp Cys Ser Glu Arg Asp Pro Ala Lys Ser Glu
130 135 140

Ile Phe Ile Val Glu Gly Asp Ser Ala Gly Gly Ser Ala Lys Gln Ala
145 150 155 160

Arg Asn Arg Asp Asn Gln Ala Val Leu Pro Leu Arg Gly Lys Ile Leu
165 170 175

Asn Val Glu Arg Ala Arg Phe Asp Lys Met Leu Ser Ser Asp Gln Ile
180 185 190

Gly Thr Leu Ile Thr Ala Leu Gly Ala Gly Ile Gly Arg Asp Asp Phe
195 200 205

Asn Pro Asp Lys Val Arg Tyr His Lys Ile Val Leu
210 215 220
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SEQ ID NO: 23 
SEQUENCE LENGTH: 1422 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
MOLECULE TYPE: genomic DNA 
ORIGINAL SOURCE 

ORGANISM: Cytophaga lytica 

STRAIN: MBIC 1544 
SEQUENCE DESCRIPTION 

GAT AAA GAT TCA TAC AAA GTA TCT GGT GGT TTA CAC GGT GTA GGT GTA 48
Asp Lys Asp Ser Tyr Lys Val Ser Gly Gly Leu His Gly Val Gly Val
1 5 10 15

TCT TGT GTA AAC GCA TTA TCT AAT AAT TTA AAA GCT ACT GTT TAC AGA 96
Ser Cys Val Asn Ala Leu Ser Asn Asn Leu Lys Ala Thr Val Tyr Arg
20 25 30

GAA GGT AAA ATA TGG GAG CAA GAG TAT GAA AGA GGT AAG GCT TTA TAT 144
Glu Gly Lys Ile Trp Glu Gln Glu Tyr Glu Arg Gly Lys Ala Leu Tyr
35 40 45

CCG GTA AAA AGT ATT GGA GAA ACA GAG GAA ACA GGT ACT ATA GTT ACT 192
Pro Val Lys Ser Ile Gly Glu Thr Glu Glu Thr Gly Thr Ile Val Thr
50 55 60

TTT TAC CCA GAT GAT ACT ATA TTT ACA CAA ACT ACA GAG TAT AAT TAT 240
Phe Tyr Pro Asp Asp Thr Ile Phe Thr Gln Thr Thr Glu Tyr Asn Tyr
65 70 75 80

GAA ACG CTT TCT AAC AGA ATG CGA GAG TTG GCT TAC CTT AAT AAG GGA 288
Glu Thr Leu Ser Asn Arg Met Arg Glu Leu Ala Tyr Leu Asn Lys Gly
85 90 95

GTT ACA ATT AGC ATT ACA GAT AAG AGA GTT AAA GAT GAA AAG GGA GAG 336
Val Thr Ile Ser Ile Thr Asp Lys Arg Val Lys Asp Glu Lys Gly Glu
100 105 110

TTT TTA TCT GAA GTT TTT TAC TCT GAA GAA GGA CTA AAA GAA TTT ATT 384
Phe Leu Ser Glu Val Phe Tyr Ser Glu Glu Gly Leu Lys Glu Phe Ile
115 120 125

AAG TTT TTA GAC GGT AAC AGA GAA CAA CTA ATA CGT GAT GTT GTT TCA 432
Lys Phe Leu Asp Gly Asn Arg Glu Gln Leu Ile Arg Asp Val Val Ser
130 135 140

ATG GAA GGT GAA AAA AAC GGA ATT CCT GTT GAG GTT GCA ATG GTO TAC 480
Met Glu Gly Glu Lys Asn Gly Ile Pro Val Glu Val Ala Met Val Tyr
145 150 155 160

AAT ACA TCA TAT TCA GAA AAT CTT CAC TCT TAC GTA AAT AAT ATT AAT 528
Asn Thr Ser Tyr Ser Glu Asn Leu His Ser Tyr Val Asn Asn Ile Asn
165 170 175

ACA CAT GAA GGT GGT ACA CAC CTT TCA GGT TTT AGA AGA GGT TTA ACA 576
Thr His Glu Gly Gly Thr His Leu Ser Gly Phe Arg Arg Gly Leu Thr
180 185 190

TCA ACC TTA AAA AAG TAT GCA GAT GCA TCT GGA ATG TTA GAC AAA TTA 624
Ser Thr Leu Lys Lys Tyr Ala Asp Ala Ser Gly Met Leu Asp Lys Leu
195 200 205

AAG TTT GAG ATT CAG GGA GAT GAT TTT AGA GAA GGT TTA ACG GCT ATT 672
Lys Phe Glu Ile Gln Gly Asp Asp Phe Arg Glu Gly Leu Thr Ala Ile
210 215 220

GTG TCT GTT AAA GTT GCA GAA CCT CAG TTT GAA GGG CAA ACA AAA ACT 720
Val Ser Val Lys Val Ala Glu Pro Gln Phe Glu Gly Gln Thr Lys Thr
225 230 235 240

AAA TTA GGT AAC AGA GAA GTT TCT TCT GCA GTG AGC CAA GCT GTA TCA 768
Lys Leu Gly Asn Arg Glu Val Ser Ser Ala Val Ser Gln Ala Val Ser
245 250 255

GAA ATG CTT ACC AAC TAT TTA GAA GAA AAC CCA GAT GAT GCT AAG GTA 816
Glu Met Leu Thr Asn Tyr Leu Glu Glu Asn Pro Asp Asp Ala Lys Val
260 265 270

ATT GTA CAA AAA GTC ATT TTG GCA GCG CAA GCA CGT CAT GCG GCT ACA 864
Ile Val Gln Lys Val Ile Leu Ala Ala Gln Ala Arg His Ala Ala Thr
275 280 285

AAA GCC CGT GAA ATG GTA CAG CGT AAA ACG GTA ATG AGT ATA GGT GGT 912
Lys Ala Arg Glu Met Val Gln Arg Lys Thr Val Met Ser Ile Gly Gly
290 295 300

TTA CCA GGG AAA TTA TCA GAC TGT TCT GAG CAA GAT GCT ACA AAA TGC 960
Leu Pro Gly Lys Leu Ser Asp Cys Ser Glu Gln Asp Ala Thr Lys Cys
305 310 315 320

GAA GTA TTC CTT GTA GAG GGA GAT TCG GCG GGT GGT ACT GCT AAA CAA 1008
Glu Val Phe Leu Val Glu Gly Asp Ser Ala Gly Gly Thr Ala Lys Gln
325 330 335

GGT AGG GAC AGA AAC TTT CAG GCA ATA TTA CCG CTT CGT GGT AAA ATC 1056
Gly Arg Asp Arg Asn Phe Gln Ala Ile Leu Pro Leu Arg Gly Lys Ile
340 345 350

TTA AAT GTT GAA AAA GCA ATG CAA CAT AAG GTT TTT GAA AAC GAA GAA 1104
Leu Asn Val Glu Lys Ala Met Gln His Lys Val Phe Glu Asn Glu Glu
355 360 365

ATA AAA AAT ATT TAG ACA GCT TTA GGT GTT ACT ATT GGT ACA GAA GAA 1152
Ile Lys Asn Ile Tyr Thr Ala Leu Gly Val Thr Ile Gly Thr Glu Glu
370 375 380

GAT AGT AAA GCC TTA AAC TTA GAA AAA TTA AGA TAC CAT AAA GTA GTT 1200
Asp Ser Lys Ala Leu Asn Leu Glu Lys Leu Arg Tyr His Lys Val Val
385 390 395 400

ATT ATG TGT GAT GCC GAT GTA GAT GGT AGC CAC ATT GAA ACT TTA ATC 1248
Ile Met Cys Asp Ala Asp Val Asp Gly Ser His Ile Glu Thr Leu Ile
405 410 415

CTT ACA TTC TTC TTC CGT TTT ATG AGG GAG TTA ATA GAA GGC GGT CAC 1296
Leu Thr Phe Phe Phe Arg Phe Met Arg Glu Leu Ile Glu Gly Gly His
420 425 430

GTT TAT ATA GCA ACC CCA CCT TTA TAC TTG GTA AAA AAG GGA ACA AAA 1344
Val Tyr Ile Ala Thr Pro Pro Leu Tyr Leu Val Lys Lys Gly Thr Lys
435 440 445

AAA CGT TAT GCT TGG AAT GAT AAA GAA CGA GAT GAG ATA GCA GAA AGC 1392
Lys Arg Tyr Ala Trp Asn Asp Lys Glu Arg Asp Glu Ile Ala Glu Ser
450 455 460

TTT AAT GGT AGT GTT GGT ATA CAA AGA TAT 1422
Phe Asn Gly Ser Val Gly Ile Gln Arg Tyr
465 470



SEQ ID NO: 24 
SEQUENCE LENGTH: 474 
SEQUENCE TYPE: amino acid 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
ORIGINAL SOURCE 

ORGANISM: Cytophaga lytica

STRAIN: MBIC 1544 
SEQUENCE DESCRIPTION 

Asp Lys Asp Ser Tyr Lys Val Ser Gly Gly Leu His Gly Val Gly Val
1 5 10 15

Ser Cys Val Asn Ala Leu Ser Asn Asn Leu Lys Ala Thr Val Tyr Arg
20 25 30

Glu Gly Lys Ile Trp Glu Gln Glu Tyr Glu Arg Gly Lys Ala Leu Tyr
35 40 45

Pro Val Lys Ser Ile Gly Glu Thr Glu Glu Thr Gly Thr Ile Val Thr
50 55 60

Phe Tyr Pro Asp Asp Thr Ile Phe Thr Gln Thr Thr Glu Tyr Asn Tyr
65 70 75 80

Glu Thr Leu Ser Asn Arg Met Arg Glu Leu Ala Tyr Leu Asn Lys Gly
85 90 95

Val Thr Ile Ser Ile Thr Asp Lys Arg Val Lys Asp Glu Lys Gly Glu
100 105 110

Phe Leu Ser Glu Val Phe Tyr Ser Glu Glu Gly Leu Lys Glu Phe Ile
115 120 125

Lys Phe Leu Asp Gly Asn Arg Glu Gln Leu Ile Arg Asp Val Val Ser
130 135 140

Met Glu Gly Glu Lys Asn Gly Ile Pro Val Glu Val Ala Met Val Tyr
145 150 155 160

Asn Thr Ser Tyr Ser Glu Asn Leu His Ser Tyr Val Asn Asn Ile Asn
165 170 175

Thr His Glu Gly Gly Thr His Leu Ser Gly Phe Arg Arg Gly Leu Thr
180 185 190

Ser Thr Leu Lys Lys Tyr Ala Asp Ala Ser Gly Met Leu Asp Lys Leu
195 200 205

Lys Phe Glu Ile Gln Gly Asp Asp Phe Arg Glu Gly Leu Thr Ala Ile
210 215 220

Val Ser Val Lys Val Ala Glu Pro Gln Phe Glu Gly Gln Thr Lys Thr
225 230 235 240

Lys Leu Gly Asn Arg Glu Val Ser Ser Ala Val Ser Gln Ala Val Ser
245 250 255

Glu Met Leu Thr Asn Tyr Leu Glu Glu Asn Pro Asp Asp Ala Lys Val
260 265 270

Ile Val Gln Lys Val Ile Leu Ala Ala Gln Ala Arg His Ala Ala Thr
275 280 285

Lys Ala Arg Glu Met Val Gln Arg Lys Thr Val Met Ser Ile Gly Gly
290 295 300

Leu Pro Gly Lys Leu Ser Asp Cys Ser Glu Gln Asp Ala Thr Lys Cys
305 310 315 320

Glu Val Phe Leu Val Glu Gly Asp Ser Ala Gly Gly Thr Ala Lys Gln
325 330 335

Gly Arg Asp Arg Asn Phe Gln Ala Ile Leu Pro Leu Arg Gly Lys Ile
340 345 350

Leu Asn Val Glu Lys Ala Met Gln His Lys Val Phe Glu Asn Glu Glu
355 360 365

Ile Lys Asn Ile Tyr Thr Ala Leu Gly Val Thr Ile Gly Thr Glu Glu
370 375 380

Asp Ser Lys Ala Leu Asn Leu Glu Lys Leu Arg Tyr His Lys Val Val
385 390 395 400

Ile Met Cys Asp Ala Asp Val Asp Gly Ser His Ile Glu Thr Leu Ile
405 410 415

Leu Thr Phe Phe Phe Arg Phe Met Arg Glu Leu Ile Glu Gly Gly His
420 425 430

Val Tyr Ile Ala Thr Pro Pro Leu Tyr Leu Val Lys Lys Gly Thr Lys
435 440 445

Lys Arg Tyr Ala Trp Asn Asp Lys Glu Arg Asp Glu Ile Ala Glu Ser
450 455 460

Phe Asn Gly Ser Val Gly Ile Gln Arg Tyr
465 470

SEQ ID NO: 25
SEQUENCE LENGTH: 38 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

TGTAAAACGA CGGCCAGTCA YGCNGGNGGN AARTTYGA 

SEQ ID NO: 26
SEQUENCE LENGTH: 7 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

His Ala Gly Gly Lys Phe Asp 

1 5 
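The relationship between the degenerate primer of SEQ ID NO: 25 and the peptide of SEQ ID NO: 26 can be illustrated by expanding the IUPAC ambiguity codes codon by codon. The sketch below is illustrative only. It assumes the standard genetic code and treats the first 18 nucleotides of the primer as a non-coding 5' tail (an assumption made here; that stretch matches the widely used M13 forward sequencing primer). The function name `encoded_residues` is hypothetical.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def encoded_residues(degenerate_dna):
    """For each complete codon, return the set of residues it can encode."""
    usable = len(degenerate_dna) - len(degenerate_dna) % 3
    residues = []
    for start in range(0, usable, 3):
        codon = degenerate_dna[start:start + 3]
        expansions = product(*(IUPAC[base] for base in codon))
        residues.append({CODON_TABLE["".join(bases)] for bases in expansions})
    return residues


PRIMER = "TGTAAAACGACGGCCAGTCAYGCNGGNGGNAARTTYGA"  # SEQ ID NO: 25
TAIL_LENGTH = 18  # assumed length of the non-coding 5' tail
core = PRIMER[TAIL_LENGTH:]
print(core, encoded_residues(core))
```

Each complete codon of the 3' portion resolves to a single residue (His, Ala, Gly, Gly, Lys, Phe). The trailing `GA` is an incomplete codon and is ignored by the function; it is compatible with the Asp that ends SEQ ID NO: 26.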

SEQ ID NO: 27 
SEQUENCE LENGTH: 36 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

CTGCGTTCGT ATATGAGCNC CRTCNACRTC NGCRTC 

SEQ ID NO: 28 
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SEQUENCE LENGTH: 12 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear
MOLECULE TYPE: peptide
SEQUENCE DESCRIPTION

Asp Ala Asp Val Asp Gly Ala His Ile Arg Thr Leu
1 5 10
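A primer designed for the opposite strand, such as SEQ ID NO: 27, is read against the peptide only after taking its reverse complement. The following sketch shows this for the degenerate 3' portion of SEQ ID NO: 27. It assumes that the first 18 nucleotides form a fixed 5' portion that is set aside here; the helper `reverse_complement` is a hypothetical name, and the complement table follows the usual IUPAC pairing of ambiguity codes.

```python
# Complement of each IUPAC code (R pairs with Y, K with M, B with V, D with H).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


PRIMER = "CTGCGTTCGTATATGAGCNCCRTCNACRTCNGCRTC"  # SEQ ID NO: 27
FIXED_LENGTH = 18  # assumed length of the fixed 5' portion
degenerate_part = PRIMER[FIXED_LENGTH:]

coding = reverse_complement(degenerate_part)
codons = [coding[i:i + 3] for i in range(0, len(coding), 3)]
print(codons)
```

The resulting codons are GAY, GCN, GAY, GTN, GAY and GGN, which are the degenerate codons for Asp, Ala, Asp, Val, Asp and Gly, the first six residues of SEQ ID NO: 28.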

SEQ ID NO: 29 
SEQUENCE LENGTH: 41 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

GAAGTCATCA TGACCGTTCT GCAYGSNGGN GGNAARTTYG G 

SEQ ID NO: 30 
SEQUENCE LENGTH: 14 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Glu Val Leu Met Thr Val Leu His Ala Gly Gly Lys Phe Gly
1 5 10

SEQ ID NO: 31 
SEQUENCE LENGTH: 44 
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

AGCAGGGTAC GGATGTGCGA GCCRTCNACR TCNGCRTCNG TGAT 

SEQ ID NO: 32 

SEQUENCE LENGTH: 15 

SEQUENCE TYPE: amino acid 

TOPOLOGY: linear 

MOLECULE TYPE: peptide 

SEQUENCE DESCRIPTION 

Met Thr Asp Ala Asp Val Asp Gly Ser His Ile Arg Thr Leu Leu
1 5 10 15

SEQ ID NO: 33 
SEQUENCE LENGTH: 32 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

CAGGAAACAG CTATGACCAR RTGNGTNCCN CC 

SEQ ID NO: 34 
SEQUENCE LENGTH: 5 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide
SEQUENCE DESCRIPTION

Gly Gly Thr His Leu
1 5
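Going in the other direction, a conserved peptide such as SEQ ID NO: 34 can be back-translated into a degenerate nucleotide sequence by collapsing all codons of each residue into one IUPAC code per position. This is a minimal sketch under the standard genetic code; `back_translate` is a hypothetical helper and not the procedure stated in the specification. Note that the per-position consensus can cover more than the intended residue: for Leu it gives YTN, which also matches the Phe codons TTY.

```python
# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
CODE_FOR_BASES = {frozenset(bases): code for code, bases in IUPAC.items()}

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}
for i, a in enumerate(BASES):
    for j, b in enumerate(BASES):
        for k, c in enumerate(BASES):
            CODONS_FOR.setdefault(AMINO[16 * i + 4 * j + k], []).append(a + b + c)


def back_translate(peptide):
    """Back-translate one-letter residues into IUPAC consensus codons."""
    out = []
    for residue in peptide:
        codons = CODONS_FOR[residue]
        for position in range(3):
            bases = frozenset(codon[position] for codon in codons)
            out.append(CODE_FOR_BASES[bases])
    return "".join(out)


# Gly Gly Thr His Leu (SEQ ID NO: 34) in one-letter code.
print(back_translate("GGTHL"))
```

The first fourteen nucleotides of the result, GGNGGNACNCAYYT, are the reverse complement of the degenerate 3' portion (ARRTGNGTNCCNCC) of the primer of SEQ ID NO: 33.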

SEQ ID NO: 35 
SEQUENCE LENGTH: 34 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

GCAACGAGAT CAACACTCMN GARGGNGGNA CNCA 

SEQ ID NO: 36 

SEQUENCE LENGTH: 11 

SEQUENCE TYPE: amino acid 

TOPOLOGY : linear 

MOLECULE TYPE: peptide 

SEQUENCE DESCRIPTION 

Asn Asn Ile Asn Thr His Glu Gly Gly Thr His
1 5 10

SEQ ID NO: 37 
SEQUENCE LENGTH: 11 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear
MOLECULE TYPE: peptide
SEQUENCE DESCRIPTION

Asn Asn Ile Asn Thr Pro Glu Gly Gly Thr His
1 5 10

SEQ ID NO: 38
SEQUENCE LENGTH: 35 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

TGTAAAACGA CGGCCAGTAR YTTNKYYTTN GTYTG 

SEQ ID NO: 39 
SEQUENCE LENGTH: 6 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Gln Thr Lys Thr Lys Leu

1 5 

SEQ ID NO: 40 
SEQUENCE LENGTH: 6 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Gln Thr Lys Asp Lys Leu
1 5

SEQ ID NO: 41
SEQUENCE LENGTH: 35 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

TAGGCTAGCT GACCGTAAGA YGCNGAYRTN GAYGG 

SEQ ID NO: 42 
SEQUENCE LENGTH: 6 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Asp Ala Asp Val Asp Gly 

1 5 

SEQ ID NO: 43 
SEQUENCE LENGTH: 36 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

CCATAGCTGC GTAGCATTCA TYTCNCCNAR NCCYTT 

SEQ ID NO: 44 
SEQUENCE LENGTH: 12
SEQUENCE TYPE: amino acid
TOPOLOGY: linear
MOLECULE TYPE: peptide
SEQUENCE DESCRIPTION

Lys Gly Leu Gly Glu Met Asn Ala Thr Gln Leu Trp
1 5 10

SEQ ID NO: 45 
SEQUENCE LENGTH: 41 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

CAGGAAACAG CTATGACCAA RMGNCCNGSN ATGTAYATHG G 

SEQ ID NO: 46 
SEQUENCE LENGTH: 8 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Lys Arg Pro Ala Met Tyr Ile Gly

1 5 

SEQ ID NO: 47 
SEQUENCE LENGTH: 8 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear
MOLECULE TYPE: peptide
SEQUENCE DESCRIPTION

Lys Arg Pro Gly Met Tyr Ile Gly

1 5 

SEQ ID NO: 48 
SEQUENCE LENGTH: 38 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

TGTAAAACGA CGGCCAGTCC NCCNGCNSWR TCNCCYTC 

SEQ ID NO: 49 
SEQUENCE LENGTH: 7 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Glu Gly Asp Ser Ala Gly Gly 

1 5 

SEQ ID NO: 50 
SEQUENCE LENGTH: 39 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA
SEQUENCE DESCRIPTION

TGTAAAACGA CGGCCAGTCA TNGTNGTNTC CCANARYTG 

SEQ ID NO: 51 
SEQUENCE LENGTH: 7
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Gln Leu Trp Glu Thr Thr Met

1 5 

SEQ ID NO: 52 
SEQUENCE LENGTH: 7 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Gln Leu Trp Asp Thr Thr Met

1 5 

SEQ ID NO: 53 
SEQUENCE LENGTH: 41 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA
SEQUENCE DESCRIPTION

GAAGTCATCA TGACCGTTCT GCAYGCNGGN GGNAARTTYG A

SEQ ID NO: 54
SEQUENCE LENGTH: 14 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Glu Val Ile Met Thr Val Leu His Ala Gly Gly Lys Phe Asp
1 5 10

SEQ ID NO: 55 
SEQUENCE LENGTH: 14 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Glu Val Ile Met Thr Val Leu His Ala Gly Gly Lys Phe Asn
1 5 10

SEQ ID NO: 56 
SEQUENCE LENGTH: 14 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Glu Val Ile Met Thr Val Leu His Ala Gly Gly Lys Phe Glu

1 5 10 

SEQ ID NO: 57
SEQUENCE LENGTH: 14
SEQUENCE TYPE: amino acid
TOPOLOGY: linear
MOLECULE TYPE: peptide
SEQUENCE DESCRIPTION

Glu Val Ile Met Thr Val Leu His Ala Gly Gly Lys Phe Lys
1 5 10

SEQ ID NO: 58 
SEQUENCE LENGTH: 38 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

TGTAAAACGA CGGCCAGTGC NGGRTCYTTY TCYTGRCA 

SEQ ID NO: 59 
SEQUENCE LENGTH: 7 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Cys Gin Glu Lys Asp Pro Ala 

1 5 

SEQ ID NO: 60
SEQUENCE LENGTH: 40 
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

GAAGTCATCA TGACCGTTCT GCAACNAAYA AYATHCCNCA 

SEQ ID NO: 61 
SEQUENCE LENGTH: 6 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Thr Asn Asn Ile Pro Gln

1 5 

SEQ ID NO: 62 
SEQUENCE LENGTH: 38 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

TGTAAAACGA CGGCCAGTAA YTTNGGNTCN GGNACYTT 

SEQ ID NO: 63 
SEQUENCE LENGTH: 7 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide
SEQUENCE DESCRIPTION

Lys Val Pro Asp Pro Lys Phe 
1 5 

SEQ ID NO: 64 
SEQUENCE LENGTH: 7 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Lys Val Pro Glu Pro Lys Phe 

1 5 

SEQ ID NO: 65 
SEQUENCE LENGTH: 35 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : single 
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid synthetic DNA 
SEQUENCE DESCRIPTION 

CAGGAAACAG CTATGACCGC NMRNMRNGCN MGNGA 

SEQ ID NO: 66 
SEQUENCE LENGTH: 6 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Ala Arg Arg Ala Arg Glu
1 5

SEQ ID NO: 67 
SEQUENCE LENGTH: 6 
SEQUENCE TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Ala Arg Lys Ala Arg Glu 

1 5 



SEQ ID NO: 68 
SEQUENCE LENGTH: 6 
SEQUENCE TYPE: amino acid
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION 

Ala Lys Lys Ala Arg Glu 

1 5 



Claims 

1. A method for identifying a microorganism comprising

(i) amplifying DNA from the microorganism by PCR using two primers, wherein at least one of the said primers comprises sequence which codes for all or part of one of the following amino acid sequences (a) to (l):

(a) (Pro or Ser)-(Ala or Thr)-(Ala, Val or Leu)-(Glu or Asp)-(Val or Thr)-(Ile or Val)-(Met, Leu or Phe)-Thr-(Val, Gln or Ile)-Leu-His-Ala-Gly-Gly-Lys-Phe-(Asp or Gly)-(Asp, Gly, Asn or Ser)-(Ser, Lys, Gly, Asp or Asn)

(b) Gly-Gly-Thr-His

(c) (Ile or Leu)-Met-Thr-Asp-Ala-Asp-Val-Asp-Gly-(Ala or Ser)-His-Ile-Arg-Thr-Leu

(d) Arg-Lys-Arg-Pro-(Gly or Ala)-Met-Tyr-Ile-Gly-(Ser or Asp)-Thr

(e) Gln-(Thr or Pro)-(Lys or Asn)-(Thr, Asp, Gly, Lys, Ser, Phe or Tyr)-Lys-Leu

(f) (Tyr or Phe)-Lys-Gly-Leu-Gly-Glu-Met-Asn-(Ala or Pro)

(g) Val-Glu-Gly-Asp-Ser-Ala-Gly-Gly-Ser

(h) Lys-(His or Val)-Pro-Asp-Pro-(Gln or Lys)-Phe

(i) Leu-Pro-Gly-Lys-Leu-Ala-Asp-Cys-(Ser or Gln)-(Ser or Glu)-(Lys or Arg)-Asp-Pro-(Ala or Ser)

(j) Gln-Leu-(Trp or Arg)-(Glu or Asp)-Thr-Thr-(Met or Leu)-(Asp or Asn)-Pro

(k) Ala-(Lys or Arg)-(Lys or Arg)-Ala-Arg-Glu

(l) Phe-Thr-Asn-Asn-Ile-(Pro or Asn)-(Thr or Gln)

and which functions as a substantial primer; and

(ii) identifying the microorganism based on the nucleotide sequence of the amplified DNA fragment.

2. A method for detecting a microorganism comprising

(i) amplifying DNA from a sample by PCR using two primers, wherein at least one of the said primers comprises sequence which codes for all or a part of one of the following amino acid sequences (a) to (l):

(a) (Pro or Ser)-(Ala or Thr)-(Ala, Val or Leu)-(Glu or Asp)-(Val or Thr)-(Ile or Val)-(Met, Leu or Phe)-Thr-(Val, Gln or Ile)-Leu-His-Ala-Gly-Gly-Lys-Phe-(Asp or Gly)-(Asp, Gly, Asn or Ser)-(Ser, Lys, Gly, Asp or Asn)

(b) Gly-Gly-Thr-His

(c) (Ile or Leu)-Met-Thr-Asp-Ala-Asp-Val-Asp-Gly-(Ala or Ser)-His-Ile-Arg-Thr-Leu

(d) Arg-Lys-Arg-Pro-(Gly or Ala)-Met-Tyr-Ile-Gly-(Ser or Asp)-Thr

(e) Gln-(Thr or Pro)-(Lys or Asn)-(Thr, Asp, Gly, Lys, Ser, Phe or Tyr)-Lys-Leu

(f) (Tyr or Phe)-Lys-Gly-Leu-Gly-Glu-Met-Asn-(Ala or Pro)

(g) Val-Glu-Gly-Asp-Ser-Ala-Gly-Gly-Ser

(h) Lys-(His or Val)-Pro-Asp-Pro-(Gln or Lys)-Phe

(i) Leu-Pro-Gly-Lys-Leu-Ala-Asp-Cys-(Ser or Gln)-(Ser or Glu)-(Lys or Arg)-Asp-Pro-(Ala or Ser)

(j) Gln-Leu-(Trp or Arg)-(Glu or Asp)-Thr-Thr-(Met or Leu)-(Asp or Asn)-Pro

(k) Ala-(Lys or Arg)-(Lys or Arg)-Ala-Arg-Glu

(l) Phe-Thr-Asn-Asn-Ile-(Pro or Asn)-(Thr or Gln)

and which functions as a substantial primer; and

(ii) identifying whether the sample comprises a microorganism based on the nucleotide sequence of any amplified DNA fragments.

3. A method according to claims 1 or 2, wherein step (i) comprises: amplifying DNA from the microorganism or sample by PCR using two primers, wherein:

the first of the said primers comprises sequence which codes for all or part of one of the amino acid sequences (a) to (l); and

the second of the said primers comprises sequence which is the reverse complement of sequence which codes for all or part of a different one of amino acid sequences (a) to (l).

4. A method according to claim 3, wherein the first primer comprises sequence which codes for all or part of amino acid sequence (a) and the second primer comprises sequence which is the reverse complement of sequence which codes for all or part of amino acid sequence (b), (c), (e), (h), or (i).

FIG. 1



Sequence labels for each three-row alignment block, top to bottom: GYRB BACSU, GYRB ECOLI, GYRB PSEPU



(d) 

MEQQQNSYDE^IQVLEGLEA VRKRPGMYTGSTN S^KGLHHLVWEIVDNSIDEALAGYCT 

SNS YDSSS I KVLKGLDA VRKRPGMYIGDTD DGTGLHHMVrEVVDNAIDEALAGHCK 

-MSENQTYDSS S I KVLKGLDA VRKRPGMYIGDTD DGSGLHHMVFEVVDNS IDEALAGHCD 

(a) 

DINIQIEKDNSITWDNGRGI PVGlHEKMG RPAVEVIMTVLHAGGKrDGSG YKVSGGLBG 
EIIVTI HADKSVSVQDDGRGI PTGIHPEEG VSAAEVIMTVLHAGGKFDDNS YKVSGGLBG 
DITVI IHTDES ISVRDNGRGI PVDVHKEEGVSAAEVIMTVLHAGGKFDDNSYKVSGGLHG 

VGASVVNAI£TELDVTVHRIX3KIHR<7naCRGTO 
VGVSVVNALSQKLELVIQREGKIHRQIYEroVPQAPIA^ 

VGVSVVNALSEKLVLTVRRSGKIWEQTYVHGVPQAPMAWGESETTGra HFKPSAETFK 

ETTE YDYDLLANRVRELAFLTKGVNITI EDKREGQERKNEYHYEGGI KS YVEYLNRSKEV 
NVTEFEYEILAKRLRELSFLNSGVSIRLRDKRDGKE — DHFHYEGGIKAFVEYLNKNKTP 
N-IHFSWDILAKRIRELSFLNSGVGILLKDERSGKE — EFFKYEGGLRAFVEYLNTNKTP 

(l) (b)

VHEEP I YI EGEK-DGITVEVALQYNDS YTSNI YS FTMNIHTY E GGTHE AGFKTGLTRVTN 
IHPNIFYFSTEK-DGIGVEVALQWNDGFQENIYCFTNOT 

VNSQVFHFSVQREDGVGVEVALQWNDSFNENLLCFTNNIPQRDGGTHLVGFRSSLTRSLN 

(h) (e) 

DYARKKGLI KENDPNLSGDPVREGLTAI IS I KHPDPQFEGQTKTKLG NSEARTITDTLFS
AYMDKEGYSKKAKVSATGDDARSGLIAWS VKVPDPKFSSQTKDKLV SSEVKSAVEQQMN 
S YI EQEGLAKKNKVATTGDDAREGLTAI IS VKVPDP KFS SQTKDKLV SSEVKTAVBQEMN 

(k) (i)

TAMETFMIXNPDAAKKIVDKGLMAARARMAAK^^ 

ELLAE YLLENPTDAKI WGKI IDAARAREAARRAREMTRRKGALDLAGLPGKLADCQERD 
KYFSDFI^ENPNEAKAWGKMIDAARARE AARKAREM T^ 

(g) 

PSISELYI VEGDSAGGSA KQGRroHFQAILPLRGKILWVEKARLDKILSNNEVRSMITAL 
PALSELY LVEGPSAGGSA KQGRNRKKQAILPLKGKIL^ 

PALSELY LVEGDSAGGSA KQGRNRRTQAJLPLKGKILWVEKARFDKMISSQEVGTLITAL 
(c)

GTGIGED-FNLEKARYHKV VIMTPADVDGAHIRTLL LTFFYRYMRQI IENGYVYTAQPPL 
GCGIGRDEYNPDKLRYHS I I IMTDADVDGSHIRTLL LTFFYRQMPEI VERGHVYIAQPPL 
GCGIGREEYNIDKLRYHNI I IMTDADVDGSHIRTLL LTFFFRQLPELVERGYIYIAQPPL 

YKVKKGKQBQYI KDDEAMDQYQI S I ALDGATLHTKAS APALAGEALEKLVS E YNATQKMI 
YKVKKGKQEQYT KDDEAMEEYMTQSALEDASLHLDESAPAVSGVQLESLVNEFRSVMKTL 



NRMERRYPKAMLKELIYQPTLTEADLSDEQTVTR&AWALVS 
KRLSRLYPEELTEHFVYLPEVTLEQLGDHAVMQAWLAKLQ^ 



AEQNLF^PIVRVRTHGVDTDYPLDHEFITGGEYRRICTLGEKLRGLLEEDAFIERGERRQ 
KE3WVV^PEVEITSHGLASYITFNRDFFGSNDYRTVVNIGAKLSSLLGEGAYVQRGERRK 

(f) (j) 

LEELLKTLPQTPKP — GLQ RYKGLGEMNATQLWETTMDPS SRTLLQVTLEDAMDAP 

P VAS FEQALDWLVK ESRRGLS IQ RYKGLGEMNPEQLWETTMDPE SRRMLRVTVKDAI AAD 
A I V E FKEG LDWLMN ETT KR HT IQR YKGLG EMNPDQLWETTMD PTVRRMLKVT I E DAI AAD 



GYRB BACSU ETFEMLMGDKVEPRRNFIEAHARYVKNLDI 
GYRB ECOLI QLFTTLMGDAVEPRRAFIEENALKAANIDI 
GYRB PSEPU QIFNTLMGDAVEPRREFIESNALSVSNLDF 
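
The region labels in FIG. 1 can be related to the claimed amino acid sequences by a simple pattern search. The following is a minimal sketch under stated assumptions: regions (f) and (j) are written as one-letter regular expressions, the function name is hypothetical, and the test fragment RYKGLGEMNATQLWETTMDPS is copied from the block of FIG. 1 labelled (f) (j) with the extraction spaces removed.

```python
import re

# One-letter regular expressions for two of the claimed regions:
# (f) (Tyr or Phe)-Lys-Gly-Leu-Gly-Glu-Met-Asn-(Ala or Pro)
# (j) Gln-Leu-(Trp or Arg)-(Glu or Asp)-Thr-Thr-(Met or Leu)-(Asp or Asn)-Pro
MOTIF_PATTERNS = {
    "f": re.compile(r"[YF]KGLGEMN[AP]"),
    "j": re.compile(r"QL[WR][ED]TT[ML][DN]P"),
}


def find_motifs(protein):
    """Return {label: (0-based start, matched text)} for each region found."""
    hits = {}
    for label, pattern in MOTIF_PATTERNS.items():
        match = pattern.search(protein)
        if match:
            hits[label] = (match.start(), match.group())
    return hits


fragment = "RYKGLGEMNATQLWETTMDPS"
print(find_motifs(fragment))
```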
A method for identifying a microorganism, comprising:

(i) amplifying DNA from the microorganism by PCR using two primers, wherein at least one of the said
primers comprises sequence which codes for all or part of one of the following amino acid sequences
(a) to (l):

(a) (Pro or Ser)-(Ala or Thr)-(Ala, Val or Leu)-(Glu or Asp)-(Val or Thr)-(Ile or Val)-(Met, Leu or Phe)-Thr-
(Val, Gln or Ile)-Leu-His-Ala-Gly-Gly-Lys-Phe-(Asp or Gly)-(Asp, Gly, Asn or Ser)-(Ser, Lys, Gly, Asp or
Asn)

(b) Gly-Gly-Thr-His

(c) (Ile or Leu)-Met-Thr-Asp-Ala-Asp-Val-Asp-Gly-(Ala or Ser)-His-Ile-Arg-Thr-Leu

(d) Arg-Lys-Arg-Pro-(Gly or Ala)-Met-Tyr-Ile-Gly-(Ser or Asp)-Thr

(e) Gln-(Thr or Pro)-(Lys or Asn)-(Thr, Asp, Gly, Lys, Ser, Phe or Tyr)-Lys-Leu

(f) (Tyr or Phe)-Lys-Gly-Leu-Gly-Glu-Met-Asn-(Ala or Pro)

(g) Val-Glu-Gly-Asp-Ser-Ala-Gly-Gly-Ser

(h) Lys-(His or Val)-Pro-Asp-Pro-(Gln or Lys)-Phe

(i) Leu-Pro-Gly-Lys-Leu-Ala-Asp-Cys-(Ser or Gln)-(Ser or Glu)-(Lys or Arg)-Asp-Pro-(Ala or Ser)

(j) Gln-Leu-(Trp or Arg)-(Glu or Asp)-Thr-Thr-(Met or Leu)-(Asp or Asn)-Pro

(k) Ala-(Lys or Arg)-(Lys or Arg)-Ala-Arg-Glu

(l) Phe-Thr-Asn-Asn-Ile-(Pro or Asn)-(Thr or Gln)

and which functions as a substantial primer; and

(ii) identifying the microorganism based on the nucleotide sequence of the amplified DNA fragment.
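
Step (ii) does not state how the nucleotide sequence is compared. One common approach, assumed here purely for illustration, is to score the amplified sequence against reference sequences by percent identity and report the closest match. In the sketch below the reference names and the short sequences are hypothetical placeholders, not data from this document, and the sequences are assumed to be already aligned to equal length.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identify(query, references):
    """Return (name, identity) of the reference closest to the query."""
    scored = (
        (name, percent_identity(query, seq))
        for name, seq in references.items()
    )
    return max(scored, key=lambda item: item[1])


# Hypothetical reference fragments, for illustration only.
references = {
    "reference_A": "ATGGCTAAAGGT",
    "reference_B": "ATGGCGAAGGGC",
}
amplified = "ATGGCTAAAGGC"
print(identify(amplified, references))
```

In practice the comparison would be made over the full amplified fragment after a proper sequence alignment; this sketch only shows the scoring step.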